Infer timestamps from blob filenames instead of metadata. I just bulk uploaded my historical data in order to test (one blob per day of data, with YYYY-MM-DD names). But ASA appears to use the blob's LastModified timestamp property to decide whether the data should be included. The problem is that the LastModified dates ended up out of order and, worse, are all from today rather than from the earlier months when the data was originally collected.
Hi, Azure Stream Analytics is optimized for streams of data arriving in real time. However, there is a workaround for processing "historical data" that lands in blobs.
By default, Azure Stream Analytics uses the arrival time as the event timestamp. For blobs, you are right that this is the LastModified date. However, with the TIMESTAMP BY keyword in your query you can choose a different timestamp taken from a field in your payload.
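TIMESTAMP BY needs a timestamp field inside each record. If your records do not already carry one, one option is to add it before uploading, derived from the YYYY-MM-DD blob name. The following is only a sketch under assumptions: the field name `EventTime`, the helper names, and the one-JSON-record-per-line layout are all hypothetical and not something ASA prescribes.

```python
import json
from datetime import datetime, timezone
from typing import List

# Hypothetical field name. The job's query would then reference it, e.g.:
#   SELECT * INTO [output] FROM [input] TIMESTAMP BY EventTime
EVENT_TIME_FIELD = "EventTime"


def event_time_from_blob_name(blob_name: str) -> str:
    """Turn a 'YYYY-MM-DD' blob name (optional folder and extension) into an ISO 8601 UTC timestamp."""
    date_part = blob_name.rsplit("/", 1)[-1][:10]
    day = datetime.strptime(date_part, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return day.isoformat()


def stamp_records(blob_name: str, lines: List[str]) -> List[str]:
    """Add the event-time field to every JSON record that does not already have one."""
    stamp = event_time_from_blob_name(blob_name)
    stamped = []
    for line in lines:
        record = json.loads(line)
        # Keep an existing timestamp; only fill in the field when it is missing.
        record.setdefault(EVENT_TIME_FIELD, stamp)
        stamped.append(json.dumps(record))
    return stamped
```

This assigns midnight UTC of the blob's day to every record in it, which is only as precise as the file name; records that already include a finer-grained timestamp are left unchanged.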
Note that you will need to extend the late arrival policy in the job options so that the difference between the actual event timestamp and the arrival time stays below the maximum "late arrival" tolerance.
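To make the arithmetic concrete, here is a small sketch of the check implied above: the lateness of an event is its arrival time (the blob's LastModified) minus its event time, and that gap has to fit within the configured tolerance. The function names and the sample dates are illustrative assumptions, not ASA APIs.

```python
from datetime import datetime, timedelta, timezone


def lateness(event_time: datetime, arrival_time: datetime) -> timedelta:
    """How far behind its arrival (blob LastModified) the event timestamp is."""
    return arrival_time - event_time


def fits_late_arrival_window(event_time: datetime, arrival_time: datetime,
                             tolerance: timedelta) -> bool:
    """True when the event is no later than the configured tolerance allows."""
    return lateness(event_time, arrival_time) <= tolerance


# Illustrative values: data collected on 1 January, uploaded on 1 April.
collected = datetime(2017, 1, 1, tzinfo=timezone.utc)
uploaded = datetime(2017, 4, 1, tzinfo=timezone.utc)
gap = lateness(collected, uploaded)  # 90 days
```

With a gap of months, a tolerance of a few days is not enough, which is why bulk-uploaded historical data usually calls for disabling the limit rather than merely raising it.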
Late arrival handling can be disabled entirely, to support reprocessing of historical data, by setting the option to "-1". This cannot be done in the portal, however; you will need to do it either through the API or through Visual Studio (e.g. export the job, edit the number, and republish the job).
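The "edit the number" step could be scripted along these lines. This is a sketch with assumptions: it presumes the exported job definition is JSON and stores the tolerance under the ARM-style property name `eventsLateArrivalMaxDelayInSeconds` (matched case-insensitively, at any nesting level). Please check the key name in your own export before relying on it; the helper name is hypothetical.

```python
import json

# Assumption: ARM-style property name for the late arrival tolerance.
LATE_ARRIVAL_KEY = "eventsLateArrivalMaxDelayInSeconds"


def disable_late_arrival(node):
    """Return a copy of the job definition with every late-arrival setting set to -1."""
    if isinstance(node, dict):
        return {
            key: -1 if key.lower() == LATE_ARRIVAL_KEY.lower() else disable_late_arrival(value)
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [disable_late_arrival(item) for item in node]
    return node  # scalars are returned unchanged


# Example with an illustrative, heavily trimmed job definition.
exported = {"properties": {"eventsLateArrivalMaxDelayInSeconds": 5, "sku": {"name": "Standard"}}}
patched = disable_late_arrival(exported)
patched_json = json.dumps(patched, indent=2)  # text to write back before republishing
```

The function returns a new structure instead of editing in place, so the original export stays available for comparison before you republish.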
Let us know if you have further questions or need additional help.
JS (Azure Stream Analytics)
PS: See more info on time policies here: https://docs.microsoft.com/en-us/azure/stream-analytics/stream-analytics-time-handling