How can we improve Microsoft Azure Data Lake?

Support Event Hubs as a stream-type data input

Being able to store stream data in Azure Data Lake directly would let us skip the many data-copy operations between Azure services. This would be a great way to realize a lambda architecture in Azure.
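
As a rough illustration, here is the copy step this suggestion would remove: a minimal sketch assuming the current Azure Python SDKs (azure-eventhub and azure-storage-file-datalake). Every connection string, hub name, and path below is a placeholder.

    # Sketch only: reads events from Event Hubs and appends them to a
    # Data Lake file by hand, which is the hop this idea would eliminate.
    from azure.eventhub import EventHubConsumerClient
    from azure.storage.filedatalake import DataLakeServiceClient

    lake = DataLakeServiceClient.from_connection_string("<adls-connection-string>")
    out = lake.get_file_system_client("raw").get_file_client("events/stream.jsonl")
    out.create_file()
    written = 0

    def on_event(context, event):
        # Append each event body as one line, then flush to commit the bytes.
        global written
        data = event.body_as_str().encode("utf-8") + b"\n"
        out.append_data(data, offset=written)
        written += len(data)
        out.flush_data(written)

    consumer = EventHubConsumerClient.from_connection_string(
        "<event-hubs-connection-string>",
        consumer_group="$Default",
        eventhub_name="telemetry",
    )
    with consumer:
        consumer.receive(on_event=on_event, starting_position="-1")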

25 votes

Daiyu Hatakeyama shared this idea

4 comments

  • Mayo commented

    Yes, please! This use case is vital: Event Hubs Archive --Data Factory--> Data Lake Store <-- U-SQL ingest <-- Scheduler. Right now there are many blockers on the ADLA side, around support for Avro and the empty Avro files (file header, no blocks) generated by Event Hubs Capture.

    Even better, it would be a wow moment to extract this data (from Avro) into a table and automatically keep it up to date as new files arrive.
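
    For reference, a minimal sketch of reading Capture's Avro output in Python, assuming the fastavro package (the path is a placeholder). A header-only file simply yields no records, so the empty files mentioned above can be skipped without special-casing:

        import fastavro

        def capture_bodies(path):
            # Capture files store the event payload in the "Body" field as bytes.
            with open(path, "rb") as f:
                for record in fastavro.reader(f):  # header-only files yield nothing
                    yield record["Body"]

        for body in capture_bodies("capture/00.avro"):
            print(body.decode("utf-8"))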

  • Iain commented

    Are you thinking of a one-step version of

    Event Hubs Archive --Data Factory--> Data Lake Store <-- U-SQL ingest <--Scheduler

    Now that I write it out like that, your suggestion does sound very useful!

  • Nick Darvey commented

    It'd be highly valuable to us to be able to pour messages coming from IoT Hub into Data Lake.

  • Sachin C Sheth commented

    Hi,

    Thanks for your suggestions. We had a few clarifying questions so that we can understand your requirements better.

    What is your definition of a lambda architecture, please? My understanding is that this is currently done by having multiple readers fork data off from the message broker (Event Hubs, Kafka, etc.) to support a cold path and a hot path. How would supporting Event Hubs as a stream input data type in Azure Data Lake make lambda architectures better?
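
    As a rough sketch of the pattern I mean (assuming the azure-eventhub Python package; the connection string and names are placeholders), two consumer groups fork the same stream into a hot and a cold path:

        import threading
        from azure.eventhub import EventHubConsumerClient

        CONN = "<event-hubs-connection-string>"  # placeholder

        def run_reader(consumer_group, on_event):
            client = EventHubConsumerClient.from_connection_string(
                CONN, consumer_group=consumer_group, eventhub_name="telemetry")
            with client:
                client.receive(on_event=on_event, starting_position="-1")

        def hot_path(context, event):
            print("alert check:", event.body_as_str())   # low-latency processing

        def cold_path(context, event):
            archive.append(event.body_as_str())          # batch/archival processing

        archive = []
        threading.Thread(target=run_reader, args=("hot", hot_path), daemon=True).start()
        run_reader("cold", cold_path)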

    Also, you indicated that you perform data copy operations many times. Can you please explain where you do that, which stores you copy to, and what processing you do on each copy?

    Thanks,
    Sachin Sheth
    Program Manager
    Azure Data Lake
