Take input from CosmosDB
Currently Stream Analytics accepts data from Blob storage. Support for reading data from DocumentDB should be added.
Cosmos DB should be a source of reference data, but the Cosmos DB Change Feed should be a valid input to a Stream Analytics job as well. These solve different use cases, and both are equally necessary.
Hi, Microsoft is increasingly pushing for Cosmos DB to be a way of unifying streaming and batch data, so being able to connect an ASA job to the Change Feed of a collection is crucial to making that vision real. It could then also be used to very easily add aggregates to your Cosmos DB (in a different collection, by using the existing Cosmos DB output).
Siva Muthi commented
This would be a great addition to Stream Analytics.
Prafullakumar Surve commented
It would solve most of our analytics needs. Since all our data is in DocumentDB, we currently have to create pipelines using Azure Data Factory before we can process it via Stream Analytics. That data transfer would be avoided if this feature were available.
Reference data from DocDB would be very nice!
Ian Bennett commented
Perhaps the change feed functionality in DocumentDB brings this closer to reality?
I think that suggestions like this reflect usage of Stream Analytics as a basic ELT tool within the Azure environment. This is understandable, as it is easy to set up and use, and there are not many alternatives that avoid setting up expensive VMs to run SSIS or the like. Even then, older batch-oriented tools like that typically do not handle streaming well.
Would really like to see this feature driven forward. We are updating a project from SQL to DocDB and would like to drive Stream Analytics from the data store.
@Azure Stream Analytics Team, we would definitely use it only as a reference data store. Like the request for reference data from SQL, having reference data readable from DocumentDB would be great. Using ADF (which is an expensive service) to move data from the system of record to another system just so that Stream Analytics can read it doesn't seem like a good use of resources.
Riccardo Zamana commented
I agree with this proposal.
Often the data we need is document-based: for example, various registries that hold associations between entities and post-processing jobs, or a POI registry used as input while the GPS positions arrive through the Event Hub.
Thank you for your feedback. Could you help clarify your scenario a bit? How would you like to use input from DocDB? Are you looking for the data to be used as reference data within streaming queries, or as a data stream? As DocDB is a document database, it is not generally a source for streaming input. In contrast, Blobs can be used in a way that mimics a stream (although not in real time).