Remove cloud to device time-to-live limit of 48 hours
We have scenarios where it is valid that a device may not connect in any arbitrary 48-hour period, such as a stationary vehicle, or a specialised machine that is not in use for a long period.
Because command messages expire, we have to monitor expiries and resend the message every 48 hours if we want to use IoT Hub. Alternatively, we could bypass IoT Hub and use Service Bus (which is odd, since IoT Hub is built on Service Bus), or implement some other (HTTP) service that devices can call. Both options mean that we can't rely on IoT Hub security for cloud-to-device messages.
I get that there is potentially a resource usage issue (millions of messages that sit around for ages), but in our case we send very few cloud-to-device messages, maybe one message per month per device. Service Bus has (I believe) unlimited TTL, with a maximum queue size. We're okay with limiting the number of long-living messages, just not with an arbitrary TTL number. Why choose 48 hours anyway? A machine in a factory switched off on a Friday afternoon will be off for 60 hours before being switched on on Monday, so 48 hours doesn't even cover a normal weekend.
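To make the weekend figure concrete, here is a quick check in Python. The switch-off and switch-on times (18:00 on a Friday, 06:00 on the following Monday) and the dates are assumptions picked purely for illustration; only the 48-hour limit comes from the discussion above.

```python
from datetime import datetime, timedelta

# The cloud-to-device TTL ceiling discussed in this thread.
C2D_MAX_TTL = timedelta(hours=48)

# Assumed times, for illustration only.
switched_off = datetime(2017, 1, 6, 18, 0)  # a Friday, 18:00
switched_on = datetime(2017, 1, 9, 6, 0)    # the following Monday, 06:00

offline = switched_on - switched_off
offline_hours = offline.total_seconds() / 3600

# A message sent just after switch-off would expire before the machine returns.
message_expired = offline > C2D_MAX_TTL

print(f"offline for {offline_hours:.0f} h, message expired: {message_expired}")
```

With these assumed times, a command queued on Friday evening expires 12 hours before the machine comes back online.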
We suggest using device twins to handle configuration updates.
See https://docs.microsoft.com/en-us/azure/iot-hub/iot-hub-devguide-c2d-guidance for more information.
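As a rough illustration of why a twin suits infrequently connected devices, the sketch below models the desired/reported idea in plain Python. It is a toy model, not the IoT Hub SDK: `TwinModel`, `set_desired`, `device_sync`, and the property names are all hypothetical. The point it tries to show is that the cloud stores target state rather than queueing messages, so there is nothing to expire while the device is offline.

```python
from dataclasses import dataclass, field


@dataclass
class TwinModel:
    """Toy stand-in for a device twin (hypothetical, not the IoT Hub SDK)."""
    desired: dict = field(default_factory=dict)
    reported: dict = field(default_factory=dict)
    desired_version: int = 0

    def set_desired(self, **changes):
        # Cloud side: overwrite the target state. Nothing is queued,
        # so a later write simply supersedes an earlier one.
        self.desired.update(changes)
        self.desired_version += 1

    def pending_changes(self):
        # Properties whose desired value the device has not yet reported.
        return {k: v for k, v in self.desired.items()
                if self.reported.get(k) != v}

    def device_sync(self):
        # Device side: run whenever the device connects, however late.
        changes = self.pending_changes()
        self.reported.update(changes)
        return changes


twin = TwinModel()
twin.set_desired(telemetry_interval_s=600)
twin.set_desired(telemetry_interval_s=300, firmware_channel="stable")

# Weeks later (in this toy model, time does not matter) the device connects.
applied = twin.device_sync()
print(applied)
```

Note that the device only ever sees the latest desired state, which fits configuration changes but not a sequence of distinct commands.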
We recently delivered the device twin functionality that should work for your scenario.
Hi Simon, Joby,
We are always listening to feedback on our quotas, and are open to changing them (when possible) if they affect many customers and potential customers. The main issue is that any new limit we define would still affect some solutions. What limit would work for your cases?
We are also aware that there are some scenarios for which C2D messages are ill-suited (e.g. updating configurations on devices, and long-running workflows), and we are working on new capabilities to address those in a better way.
It would help to understand the specific scenarios you are using these messages for.
Simon Munro commented
I have to disagree with you Elio.
Redefining our problem of, as I see it, high-latency communication, into long-running workflows doesn't address the issue. I completely understand that you have made design decisions to make IoT Hub work the way that it does - in that case it is a (fair) limit on IoT hub rather than poor design of our systems.
I have run into another example with a customer. They have low-powered devices that are fitted to farm gates and only connect when the gate is used. Do I tell the client that no, devices cannot run on a single charge for years, because they need to communicate every 48 hours? Or do I send a new message to the device every 48 hours for weeks until it connects because Microsoft has defined it as a long-running workflow? I will do neither. I will store messages in a 'reliable' store like blob storage and write a service on top of that - completely bypassing IoT Hub. As per Joby's observation, since that requires different authentication for C2D and D2C, it makes sense to bypass IoT hub altogether and use my own service for incoming and outgoing communications.
Again, I respect the decisions that you have made on IoT Hub to provide really good IoT service. But, that may be at odds with certain scenarios, and IoT Hub may not work in all cases. Surely the idea of UserVoice is to provide a platform for us to let you know about the scenarios that we encounter in the field, where they can be considered for everyone's benefit, rather than have the product team tell us that our own interpretation of our own problems is incorrect.
Joby Joseph commented
I second Simon's opinion as we are also facing the same challenge. We have machines in mines which cannot come online very often and we can't afford to send people to mines just to make configuration changes to the machines. In my opinion, this is a very valid use case. Of course, we can solve it by other means, but the solution will be more complicated as the devices will have to deal with multiple authentication and security mechanisms.
IoT Hub messaging was designed to provide reliable communications. As such, the TTL is designed to overcome connectivity issues, not to assist in workflows (as an enterprise messaging system such as Service Bus would). IoT Hub has to avoid keeping messages indefinitely, as it should not be the source of truth on the status of your long-running workflows. The 48-hour limit derives from the notion that any flow running for more than a day is considered "long running".
If you have a long-running workflow, the recommended route is to store the status of this workflow in the cloud (in Service Bus or DocumentDB, for instance), and use IoT Hub just for communications. IoT Hub provides a feedback queue for C2D messages that reports whether each message was successfully received by the device or expired. This feedback helps drive your long-running workflow in the cloud and decide whether a message is worth resending.
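A minimal sketch of that pattern, in plain Python, might look like the following. Everything here is an assumption made for illustration: `CommandWorkflow`, `PendingCommand`, the `send` callback, the in-memory dictionary standing in for a durable store, and the status strings. It does not call IoT Hub; it only shows the decision logic of resending on an expiry report until the workflow's own deadline passes.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from itertools import count

# Assumed feedback status strings, for illustration.
SUCCESS = "Success"
EXPIRED = "Expired"


@dataclass
class PendingCommand:
    device_id: str
    payload: str
    deadline: datetime  # the workflow's own deadline, independent of message TTL
    attempts: int = 0
    done: bool = False


class CommandWorkflow:
    """Cloud-side source of truth (hypothetical; a real one would be durable)."""

    def __init__(self, send):
        self._send = send          # callable(message_id, device_id, payload)
        self._in_flight = {}       # message_id -> PendingCommand
        self._ids = count(1)

    def submit(self, command):
        command.attempts += 1
        message_id = f"msg-{next(self._ids)}"
        self._in_flight[message_id] = command
        self._send(message_id, command.device_id, command.payload)

    def on_feedback(self, message_id, status, now):
        command = self._in_flight.pop(message_id, None)
        if command is None:
            return "unknown"
        if status == SUCCESS:
            command.done = True
            return "completed"
        if status == EXPIRED and now < command.deadline:
            self.submit(command)   # still worth resending
            return "resent"
        return "abandoned"


sent = []
workflow = CommandWorkflow(lambda mid, dev, body: sent.append(mid))
start = datetime(2017, 1, 1)
cmd = PendingCommand("gate-17", "set-interval=300",
                     deadline=start + timedelta(days=30))

workflow.submit(cmd)
first = workflow.on_feedback("msg-1", EXPIRED, start + timedelta(hours=48))
second = workflow.on_feedback("msg-2", SUCCESS, start + timedelta(hours=50))
print(sent, first, second, cmd.attempts)
```

The design choice is that the deadline lives with the workflow record rather than with the message, so the 48-hour TTL only bounds each delivery attempt.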