Enable indefinite storage of events on Event Hub to allow for an Event Sourcing architecture.
When building an application using Event Sourcing, the events are the true source of all data. Being able to replay events and create different projections at any time during the lifetime of the app is crucial. Currently Event Hubs only supports a retention period of up to 7 days, which means it cannot be used as the event store for Event Sourcing. An example of this kind of product is Greg Young's EventStore. https://geteventstore.com/
>>Realistically, if you don't snapshot your ES periodically, you can never again catch up, so while the theory of "raw data forever" sounds good, nontrivial event volumes make it hardly workable.
What if our volumes are not that big, and we just want an event-driven, event-sourcing-based approach? So far this is a big limitation of Event Hubs. We don't want to do extra management, and when needed we just want to be able to rewind the stream to the beginning and reprocess all the events. Imagine continuous machine learning scenarios, or tons of others, such as adding a new Azure Function and wanting it to be called for all events from the beginning.
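To make the "rewind to the beginning" scenario concrete, here is a minimal in-memory sketch. It does not use the Event Hubs SDK; `EventLog`, `replay`, and the sample account events are hypothetical names invented for illustration, and the log simply stands in for a stream with unlimited retention.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterator, List, Tuple


@dataclass
class EventLog:
    """Hypothetical stand-in for a stream that never expires events."""
    events: List[dict] = field(default_factory=list)

    def append(self, event: dict) -> int:
        self.events.append(event)
        return len(self.events) - 1  # offset of the appended event

    def read_from(self, offset: int = 0) -> Iterator[Tuple[int, dict]]:
        # Yield (offset, event) pairs starting at the requested offset.
        for i in range(offset, len(self.events)):
            yield i, self.events[i]


def replay(log: EventLog, handler: Callable[[dict], None], start: int = 0) -> int:
    """Feed every retained event to a handler and return the next offset to read."""
    next_offset = start
    for offset, event in log.read_from(start):
        handler(event)
        next_offset = offset + 1
    return next_offset


log = EventLog()
log.append({"type": "deposited", "amount": 100})
log.append({"type": "withdrawn", "amount": 30})

# A subscriber added later builds its projection from offset 0.
totals = {"balance": 0}


def on_event(event: dict) -> None:
    sign = 1 if event["type"] == "deposited" else -1
    totals["balance"] += sign * event["amount"]


next_offset = replay(log, on_event)
```

A new consumer only gets a complete projection if offset 0 is still retained, which is exactly what a 7 day retention limit prevents.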
Daniel Turan commented
I think Event Hubs and Event Store serve different purposes, although they might be complementary.
I agree. A very common pattern when adopting an event-based paradigm involves not only consuming current messages in the Event Hub: as a new "subscriber", I also need to ask for all historical messages to "catch up" to the current stream. Having a way to persist historical messages by ID (last write wins) via the Archive feature or something similar would be very helpful.
Per OEkvist commented
How about "Log Compaction"? It gives you a sort of snapshot instead of indefinite retention.
Are there any plans for Event Hubs to support log compaction?
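For readers unfamiliar with the idea, log compaction can be sketched as keeping only the most recent event per key (the "last write wins" behavior mentioned above). This is a plain Python illustration, not an Event Hubs feature or API; the `compact` function and the `(key, payload)` event shape are assumptions made for the example.

```python
from typing import Dict, Iterable, List, Tuple

Event = Tuple[str, dict]  # (key, payload); a hypothetical event shape


def compact(events: Iterable[Event]) -> List[Event]:
    """Keep only the latest payload per key, ordered by each key's last write."""
    latest: Dict[str, dict] = {}
    for key, payload in events:
        # Remove any older entry first so the key moves to the end of the dict.
        latest.pop(key, None)
        latest[key] = payload
    return list(latest.items())


stream = [
    ("device-a", {"temp": 20}),
    ("device-b", {"temp": 18}),
    ("device-a", {"temp": 23}),
]
compacted = compact(stream)
```

Here `compacted` holds one entry per device. The trade-off is that intermediate events are discarded, so this suits "current state per key" consumers better than projections that need the full history.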
Clemens Vasters commented
Event Hub Archive enables indefinite event storage. You can retrieve raw archived events from Avro containers in Blob store.
Realistically, if you don't snapshot your ES periodically, you can never again catch up, so while the theory of "raw data forever" sounds good, nontrivial event volumes make it hardly workable.
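The snapshotting point can be sketched as follows: fold the events into a state, save that state together with the offset it covers, and later replay only the events after that offset. This is a toy illustration under assumed names (`Snapshot`, `apply`, `rebuild`) with integer events; it does not read from Archive or Blob storage.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Snapshot:
    state: int        # folded state covering events[0:next_offset]
    next_offset: int  # first offset NOT yet included in the state


def apply(state: int, event: int) -> int:
    # Hypothetical fold step: events are plain deltas applied to a running total.
    return state + event


def rebuild(events: List[int], snapshot: Optional[Snapshot] = None) -> Snapshot:
    """Rebuild state from a snapshot if given, otherwise from the first event."""
    state = snapshot.state if snapshot else 0
    start = snapshot.next_offset if snapshot else 0
    for event in events[start:]:
        state = apply(state, event)
    return Snapshot(state=state, next_offset=len(events))


events = [5, -2, 10]
snap = rebuild(events)           # full replay, then keep the result as a snapshot
events.extend([1, 1])            # new events arrive later
current = rebuild(events, snap)  # only the two new events are replayed
```

With a snapshot, catch-up cost depends on the number of events since the snapshot rather than on the full length of the history.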