How can we improve Microsoft Azure Data Lake?

Query Windows Azure Storage Table in Azure Data Lake Analytics U-SQL

Support Windows Azure Storage Table queries as external tables.

We want to run U-SQL queries directly over source data in WAST without first having to extract it to flat files in ADLS. We use WAST as our transactional data store and then need to perform aggregations over this data to produce summary outputs. ADLA would be ideal for this, but preferably without the overhead of unnecessary data movement.

25 votes

Darran Shepherd shared this idea

5 comments

  • Alex commented

    For starters, it would be great to just be able to push data to Azure Table Storage.
    Since both sides are schematized, this should be straightforward.

    Data Factory can help us in the meantime, but it is extremely painful to use.

  • Alex commented

    Ideally, support for WAST would be bidirectional,
    in that we could both query and write to table storage.

    A fixed schema would be fine for v1 support.

  • Darran Shepherd commented

    Hi Michael,

    Sorry for not responding sooner.

    For our particular use case, we would be happy with a fixed schema across the columns of the table. We would not particularly require a column set abstraction, although I can see how it might be useful for people taking more advantage of the dynamic nature of WAST.

    We would most definitely want to be able to push filters into the table engine to extract only the relevant data for the current query. Ideally (not sure this is possible), ADLA would be able to use the partitioning scheme of the table to help inform the parallelisation of the U-SQL query.
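The two wishes above (filter pushdown and partition-aware parallelism) can be sketched roughly as follows. This is only an illustration, not how ADLA works or would work: `partition_filter` and `plan_scans` are hypothetical helper names, the round-robin assignment is an assumed strategy, and the filter strings follow the OData-style syntax that Azure Table Storage queries accept.

```python
from typing import List, Sequence


def partition_filter(partition_key: str, extra_filter: str = "") -> str:
    """Build an OData-style filter restricted to a single partition.

    Hypothetical helper: the point is that the filter is evaluated by the
    table engine, so only matching entities leave the store.
    """
    # Single quotes inside an OData string literal are escaped by doubling.
    escaped = partition_key.replace("'", "''")
    clause = f"PartitionKey eq '{escaped}'"
    return f"{clause} and ({extra_filter})" if extra_filter else clause


def plan_scans(
    partition_keys: Sequence[str], workers: int, extra_filter: str = ""
) -> List[List[str]]:
    """Assign one filtered scan per partition to workers, round-robin.

    Assumed strategy for illustration; a real engine would likely also
    weigh partition sizes.
    """
    if workers < 1:
        raise ValueError("workers must be at least 1")
    plan: List[List[str]] = [[] for _ in range(workers)]
    for index, key in enumerate(partition_keys):
        plan[index % workers].append(partition_filter(key, extra_filter))
    return plan


if __name__ == "__main__":
    # Hypothetical partition keys and filter, for demonstration only.
    for worker, scans in enumerate(plan_scans(["2016-01", "2016-02", "2016-03"], 2, "Amount gt 100")):
        print(worker, scans)
```

Each worker would then issue only its own filtered scans, so parallelism follows the table's partitioning and unneeded rows are never transferred.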

  • Michael Rys commented

    Dear Darran (and others who vote this item up)

    Thanks for the request.

    Azure tables (unlike Azure Blob Stores) have a semi-structured data model (key plus column set) and a simple query language. Thus, we need to better understand your scenario requirements in order to scope the work:

    A) Are you interested in/ok with exposing a structured view over a fixed set of columns (filling in nulls for columns that are not there in a set, giving a mapping into a MAP<string,string> for the flexible columns)?
    B) Are you interested in having us provide a column set abstraction in U-SQL in order to provide similar flexibility?
    C) Are you interested in us pushing filters into the Azure Table engine?

    Thanks!
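Option A above can be pictured with a small sketch. It is an assumption-laden illustration rather than any actual U-SQL or ADLA behaviour: `FIXED_COLUMNS` and `to_structured_row` are hypothetical names, the column names are invented, and a plain Python dict of strings stands in for `MAP<string,string>`.

```python
from typing import Any, Dict, Optional, Sequence, Tuple

# Hypothetical fixed schema; these column names are illustrative only.
FIXED_COLUMNS: Tuple[str, ...] = ("PartitionKey", "RowKey", "Amount", "Currency")


def to_structured_row(
    entity: Dict[str, Any],
    fixed_columns: Sequence[str] = FIXED_COLUMNS,
) -> Tuple[Dict[str, Optional[Any]], Dict[str, str]]:
    """Split one table entity into (fixed columns, flexible columns)."""
    # Fixed columns: use the value if present, otherwise fill in None (null).
    fixed = {name: entity.get(name) for name in fixed_columns}
    # Flexible columns: everything else, stringified to mimic MAP<string,string>.
    flexible = {
        name: str(value)
        for name, value in entity.items()
        if name not in fixed_columns
    }
    return fixed, flexible


if __name__ == "__main__":
    # Hypothetical entity lacking "Currency" and carrying an extra "Note".
    entity = {"PartitionKey": "2016-01", "RowKey": "42", "Amount": 9.5, "Note": "refund"}
    fixed, flexible = to_structured_row(entity)
    print(fixed)
    print(flexible)
```

Every row then has the same fixed shape, while properties outside the schema remain reachable through the map.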
