Update Service Bus SLA to include a maximum delay for delivering messages
Today the SLA of 99.95% is too weak and doesn't include any maximum delay for delivering messages to consumers.
Because of an unhealthy partition, I received several messages hours after their ScheduledEnqueueTimeUtc. Some were consumed 8 hours late and created chaos in the middle of the night (in my case, sending thousands of SMS text messages).
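Until the SLA covers this, the only protection I can see is a guard on the consumer side. Below is a minimal sketch of such a guard, not a definitive implementation. It assumes the consumer can read the message's scheduled enqueue time as a timezone-aware UTC datetime. The names `MAX_DELIVERY_DELAY`, `is_stale`, `handle` and `send_sms` are hypothetical and not part of any Service Bus SDK, and the 15-minute threshold is an arbitrary example.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical threshold: how late a message may arrive and still be acted on.
MAX_DELIVERY_DELAY = timedelta(minutes=15)


def is_stale(scheduled_enqueue_time_utc, now=None, max_delay=MAX_DELIVERY_DELAY):
    """Return True if the message arrived more than max_delay after its scheduled time."""
    now = now or datetime.now(timezone.utc)
    return now - scheduled_enqueue_time_utc > max_delay


def handle(scheduled_enqueue_time_utc, send_sms, now=None):
    """Call send_sms only if the message is still on time; report what was done."""
    if is_stale(scheduled_enqueue_time_utc, now):
        # Assumption: a late message is dropped here; dead-lettering it
        # or alerting an operator would be equally valid choices.
        return "skipped"
    send_sms()
    return "sent"
```

A guard like this only limits the damage. It does nothing about the delay itself, which is why the guarantee belongs in the SLA.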
Here's the actual description of "downtime" in the Service Bus SLA:
"Downtime" is the total accumulated Deployment Minutes, across all Queues and Topics deployed by Customer in a given Microsoft Azure subscription, during which the Queue or Topic is unavailable. A minute is considered unavailable for a given Queue or Topic if all continuous attempts to send or receive Messages or perform other operations on the Queue or Topic throughout the minute either return an Error Code or do not result in a Success Code within five minutes.
There's not a single mention of how long messages can be delayed. In theory, a message could never be delivered at all and the SLA still wouldn't be broken.
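To make the gap concrete, here is my reading of the quoted "downtime" definition as a small sketch. It is my interpretation, not Microsoft's official calculation. The `Attempt` type and the function names are hypothetical, and treating a minute with no attempts as available is my own assumption.

```python
from dataclasses import dataclass

# "within five minutes", taken from the quoted SLA text.
SUCCESS_TIMEOUT_SECONDS = 5 * 60


@dataclass
class Attempt:
    returned_error: bool        # the operation returned an Error Code
    seconds_to_success: float   # time until a Success Code; ignored on error


def attempt_failed(attempt):
    """An attempt fails if it errored or took longer than five minutes to succeed."""
    return attempt.returned_error or attempt.seconds_to_success > SUCCESS_TIMEOUT_SECONDS


def minute_is_unavailable(attempts):
    """A minute is unavailable only if every attempt made during it failed.

    Assumption: a minute with no attempts is not counted as unavailable.
    """
    return bool(attempts) and all(attempt_failed(a) for a in attempts)
```

Delivery delay is not an input anywhere. A receive call that returns a Success Code in 0.2 seconds counts as available, even if the message it hands over was scheduled 8 hours earlier.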
This is extremely important because Service Bus should be a reliable service that glues the other components in your cloud architecture together. It is, by its very definition, the most central component in a distributed cloud architecture. I can't see how Service Bus can be considered "reliable" without an SLA that states response times clearly.