Data Lake

You can use this site to communicate with the Azure Data Lake team. We are eager to hear your ideas, suggestions, or any other feedback that would help us improve the service to better fit your needs.

If you have technical questions, please visit our forums.
If you are looking for tutorials and documentation, please visit http://aka.ms/AzureDataLake.

How can we improve Microsoft Azure Data Lake?

Enter your idea and we'll search to see if someone has already suggested it.

If a similar idea already exists, you can support and comment on it.

If it doesn't exist, you can post your idea so others can support it.

  1. Support querying the table created in the same script

    When a table is created in a script and there is a query on the same table within the script, there is a compile-time error. It would be great to be able to do both in the same script; creating two scripts for CRUD operations on the same table seems unnecessary.

    8 votes
    under review  ·  3 comments
  2. USQL should be able to resolve file names at run time.

    I have a CSV file in Azure Data Lake Store that contains the paths of XML files stored in Azure Blob storage. After reading the file paths from the CSV file, I want to access and read those XML files in U-SQL and extract some useful information from them.

    8 votes
    under review  ·  1 comment
  3. List the Input and Output datasets for a U-SQL job via SDK

    For telemetry / traceability purposes we need to figure out which are the inputs / outputs for a U-SQL job. This information is available in the web UI, but it is not available via the SDK. To get it, we first get the algebra.xml file path, then fetch the file and parse it for the inputs and outputs. This is a pretty hacky approach. We suggest exposing this information natively via the SDK.

    7 votes
    under review  ·  0 comments
  4. Support on-premise data storage

    I would like Azure Data Lake to be able to work with data storage that is hosted on-premise.

    7 votes
    0 comments
  5. Unit test custom extractor locally

    When developing a custom extractor I would like to be able to test that, for a defined input, the extractor returns the expected output. For example, two use cases my extractor should handle are: to unescape certain escaped characters, and also to handle rows which do not have the expected number of columns. As the number of different cases that the extractor should handle grows, it becomes cumbersome to test each case using an individual EXTRACT USING query.

    7 votes
    under review  ·  1 comment
  6. Azure Data Lake Store integration with Azure Stack for cloud + on-prem spanning

    Integrate ADLS with Azure Stack so that ADLS can span cloud and on-premises locations and be presented as a single endpoint with unified security, metadata, and lineage (to come?) management.

    6 votes
    0 comments

    This is an item that will be considered for the longer term. At present we do not have plans on this front. We will continue to solicit customer feedback on this.

  7. Support for Java language

    Add Java support for Azure Data Lake. Our company has a lot of Java developers, so it would be useful if they could interact with Azure Data Lake efficiently in Java.

    6 votes
    under review  ·  1 comment
  8. Resume U-SQL job from point of failure in the next run

    What is needed?
    When a user runs a long job with a bunch of scripts, the ability to save state about where the job failed and resume from the point of failure the next time around.

    E.g., if there are 10 transactions and 5 were successful, the user can resume from the 6th instead of starting over.

    Why?
    In a complex script, the ability to resume a transaction helps save time and cost for the user.

    6 votes
    under review  ·  0 comments
  9. Will there be functionality for querying multiple data sources at the same time, like Presto?

    Expose some way to create custom connectors and run interactive queries against discrete data sources such as MySQL, Table Storage, SQL Server, and the ADAL file system, without importing the data.

    6 votes
    under review  ·  2 comments
  10. Provide an API to monitor health and failures

    Currently users have to log into the portal and view the job status manually. We need to automate this so that we can send notifications or take appropriate actions.

    5 votes
    0 comments
  11. Please add support for SQL.MAP and SQL.ARRAY in built-in Extractors and Outputters

    Currently these two types are not supported because there are several different possible serialization formats. If you vote on this request, please indicate what string serialization format you want to see.

    5 votes
    under review  ·  0 comments
  12. Support constant folding in U-SQL scripts in Azure Data Lake Tools for Visual Studio

    Can the IDE support features such as showing the evaluated value of a variable (from constant folding) in a tooltip when hovering over it?
    This would require a feedback channel from the constant folder to the IDE.

    It can be even more helpful with advanced features, such as intrinsics checking for the existence of a stream.
    For example, hovering over it shows immediately if the stream exists, or any other metadata.

    5 votes
    under review  ·  0 comments
  13. Improve performance of writing large files to blob

    Writing a 100 GB file to blob is painfully slow, and the work is always put on only 1 vertex. Is there a way to parallelize the work across multiple vertices?

    4 votes
    1 comment

    Unfortunately, Windows Azure Blob store does not provide an efficient way to stitch intermediate blobs together without rereading them. Thus we recommend that you instead write the output with a wildcard into the blob store, which does not stitch the files together. E.g.:
    OUTPUT @result
    TO "/path/filefolder/file_{*}.csv"
    USING Outputters.Csv();

  14. USQL is a great language for manipulating data. It would be awesome if I could create a non-cloud, standalone EXE to manipulate data locally

    USQL is a great tool for slicing and dicing files. Even though it wouldn't be able to take advantage of the massively parallel nature of a VC, it would be very, very useful if I could use the USQL language to manipulate data on a local machine, as a standalone executable that I could run against my local files as needed. Since it's so data centric, it sure beats doing the same thing in, say, C# or Python.

    4 votes
    0 comments
  15. Exception stream for data

    An option to add a non-extractable field as a column to the output of the built-in extractors, or a second rowset that contains the exceptions. We would prefer the latter for now.

    4 votes
    under review  ·  3 comments
  16. Improved handling of column aliases

    When attempting to return all of the columns across a set of joins (e.g., SELECT *), I get an error when two columns have the same name. Ideally, I'd get behavior similar to SQL (which allows this, with the columns aliased as tableAlias.columnName).

    4 votes
    under review  ·  0 comments
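
    A possible workaround for the request above, as a minimal sketch: list the columns explicitly and give each colliding column its own alias. The input and output paths, rowset names, and column names are hypothetical and chosen only for illustration.

    ```usql
    // Hypothetical inputs: both rowsets expose a column named "id".
    @orders =
        EXTRACT id int,
                customerId int,
                amount decimal
        FROM "/input/orders.csv"
        USING Extractors.Csv();

    @customers =
        EXTRACT id int,
                name string
        FROM "/input/customers.csv"
        USING Extractors.Csv();

    // Alias each "id" so the result rowset has unique column names.
    @joined =
        SELECT o.id AS orderId,
               c.id AS customerId,
               c.name,
               o.amount
        FROM @orders AS o
             INNER JOIN @customers AS c
             ON o.customerId == c.id;

    OUTPUT @joined
    TO "/output/joined.csv"
    USING Outputters.Csv();
    ```

    This keeps both id values in the result, at the cost of naming every column by hand, which is the overhead the idea asks to remove.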
  17. Allow getting the original data (without automatic decompression) for *.gz files via an extractor setting

    Allow getting the original data (without automatic decompression) for *.gz files via an extractor setting.

    4 votes
    under review  ·  1 comment
  18. Support multi-select from Azure portal

    E.g.:
    a) To select multiple folders and delete them together.
    b) To cancel multiple jobs together.

    3 votes
    0 comments
  19. Create a project type to hold the database schema definition just like a sqlproj. This will help in making it similar to SQL databases.

    Create a project type to hold the database schema definition just like a sqlproj. This will help in making it similar to SQL databases.

    3 votes
    under review  ·  0 comments
  20. Support ACL settings in the Visual Studio tools

    Make it easy for .NET developers to configure ACL settings in Visual Studio during dev/test, the same way as in the portal.

    3 votes
    under review  ·  1 comment