Data Lake

You can use this site to communicate with the Azure Data Lake team. We are eager to hear your ideas, suggestions, or any other feedback that would help us improve the service to better fit your needs.

If you have technical questions, please visit our forums.
If you are looking for tutorials and documentation, please visit http://aka.ms/AzureDataLake.

  1. Support columns of type TimeSpan

    Currently U-SQL doesn't support columns of type TimeSpan, which means that:

    1. To perform operations on TimeSpan or DateTime values, one needs to convert the TimeSpan to Ticks and perform the operation on the tick count.
    2. Once we have the calculated value as Ticks, we can't convert it back to a TimeSpan as a new column (SELECT new TimeSpan(x) AS duration FROM Table).
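
The tick round trip described in this request can be sketched outside U-SQL. The Python snippet below only illustrates the arithmetic and is not U-SQL code: it assumes the .NET convention that one tick is 100 nanoseconds, and the helper names `to_ticks` and `from_ticks` are hypothetical.

```python
from datetime import datetime, timedelta

TICKS_PER_SECOND = 10_000_000  # assumption: one .NET tick is 100 nanoseconds


def to_ticks(delta: timedelta) -> int:
    """Convert a duration to a whole number of ticks using integer arithmetic."""
    whole_seconds = delta.days * 86_400 + delta.seconds
    return whole_seconds * TICKS_PER_SECOND + delta.microseconds * 10


def from_ticks(ticks: int) -> timedelta:
    """Convert ticks back to a duration (sub-microsecond precision is dropped)."""
    return timedelta(microseconds=ticks // 10)


# A duration is computed from two timestamps, stored as ticks,
# and later turned back into a duration value.
start = datetime(2016, 1, 1, 8, 0, 0)
end = datetime(2016, 1, 1, 9, 30, 0)
duration_ticks = to_ticks(end - start)
duration = from_ticks(duration_ticks)
```

The second step, turning the stored tick count back into a duration column, is the part the request says is missing in U-SQL.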

    18 votes
    under review  ·  3 comments
  2. Provide an API that allows exporting data from ADLS tables

    There isn't currently an API that lets me consume data that exists in a table in ADLS. Forking the data to a file (for which there is a download API) isn't ideal because of differences between schema on write and schema on read, duplication, latency, etc.

    18 votes
    under review  ·  2 comments
  3. Provide support for installing NuGet packages on databases

    When creating custom components such as extractors you can register the assemblies from within Visual Studio to a specific Azure Data Lake Analytics database.

    However, when you want to share your components with someone or re-use them in multiple projects/databases, you need to get the DLLs from the source and register all of them before you're good to go.

    It would be nice to be able to use NuGet packages to distribute re-usable code. This would also allow us to open-source our components and allow other people to easily re-use them.

    18 votes
    2 comments
    under review  ·  Saveen responded

    Thanks for the feedback Tom.

    If I understand this scenario correctly, you want to be able to use a U-SQL Catalog almost like a NuGet feed? So that for example, someone could pull and register assemblies into their VS Projects from a U-SQL Catalog. Do I have that right?

  4. Support Federated Distributed U-SQL Queries among ADLA accounts

    Currently U-SQL supports federated distributed queries between one ADLA account and SQL Servers in Azure in any data center.

    Please add support for federating and distributing U-SQL queries across several ADLA accounts, either in the same data centers or in others (once they become available). This allows the query to get executed closer to its data.

    15 votes
    under review  ·  0 comments
  5. PowerBI content pack with useful graphs set up to help me understand usage

    I love the fact that multiple users can submit jobs. I'd love to be able to have a nice clean dashboard that I can use in PowerBI that gives an overview of who is doing what, accessing what data, and consuming resources.

    15 votes
    under review  ·  1 comment
  6. 14 votes
    under review  ·  0 comments
  7. Allow dead code

    When I'm debugging, dead code can be valuable so I don't have to comment out huge portions of code when I just want to quickly check something.

    13 votes
    under review  ·  0 comments
  8. Add support for SQL CASE Statement

    While U-SQL uses C# as its expression language, some frequently used SQL expressions, such as CASE, would help when migrating Hive or SQL scripts to U-SQL, similar to how you have added LIKE.

    13 votes
    under review  ·  2 comments
  9. Please add support for SQL COALESCE

    While U-SQL is using C# as an expression language, there are some expressions in SQL that are cumbersome to write in C#. Please add support for COALESCE (https://msdn.microsoft.com/en-us/library/ms190349.aspx) to U-SQL similar to how you have added LIKE.

    13 votes
    1 comment
  10. Support U-SQL for non-big-data scenarios, such as web apps that need to query Azure SQL together with Azure Tables and DocumentDB on a large-scale dataset

    Web developers of multi-tenant systems often need to query several SQL servers, DocumentDB, and other storage types. A unified query platform could replace the mediator pattern that is otherwise used.

    13 votes
    1 comment
    under review  ·  matt winkler responded

    This is directionally aligned with how we see customers using U-SQL. It is not on our short term roadmap at this point in time.

  11. Add option to built-in Extractors to truncate or ignore values that are too long

    String data types are limited to 128kB, and byte[] values are also limited in the built-in extractors (byte[] is normally limited to 4MB, but in the built-in extractors the limit is much lower since values go through strings :(). It would be very useful to either be able to ignore rows with values that are too long, or at least have the option to truncate with a warning.

    12 votes
    under review  ·  1 comment
  12. Support for multiple file set paths

    It would make code shorter if we could combine multiple file sets in one EXTRACT.
    Here is an example:

    DECLARE CONST @set1 = "wasb://mycontainer1@myaccount.blob.core.windows.net/{*}/XXXXXX-{date:yyyy}{date:MM}{date:dd}.json";
    DECLARE CONST @set2 = "wasb://mycontainer2@myaccount.blob.core.windows.net/{*}/XXXXXX-{date:yyyy}{date:MM}{date:dd}.json";
    DECLARE CONST @set3 = "wasb://mycontainer3@myaccount.blob.core.windows.net/{*}/XXXXXX-{date:yyyy}{date:MM}{date:dd}.json";

    EXTRACT my_data,
            date DateTime
    FROM @set1, @set2, @set3
    USING Extractors.Text();

    At the time of writing, we get error message "Syntax not supported: Streamset not supported in file list".

    Thank You.

    This request is related to this post in the ADL forum:
    https://social.msdn.microsoft.com/Forums/azure/en-US/d65e54b1-9122-496a-9ba6-74da5cae082a/syntax-not-supported-streamset-not-supported-in-file-list?forum=AzureDataLake

    11 votes
    0 comments
  13. Column Annotation Support

    We would like the ability to provide hints on columns that indicate cross-dataset relationships. Building on this, such a foreign-key-style hint could allow the Visual Studio tools to suggest joins across multiple streams where the hints have the same value.

    10 votes
    under review  ·  1 comment
  14. Machine Learning (ML) C# package is needed

    Please provide a Python "sklearn"-style package in C# that U-SQL can call.

    This is really important, as much of the big data workload is machine learning. Some basic classification and regression methods would be enough to start with.

    10 votes
    under review  ·  2 comments
  15. Tool or PowerShell command to list all folder sizes recursively

    Please provide a tool or PowerShell command that lists all folder sizes recursively. This would help find the top consumers of space.
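
The aggregation this request asks for can be sketched as follows. This is a Python illustration against a local directory tree, used purely as a stand-in: it does not call any Data Lake Store API, and the function names `folder_sizes` and `top_consumers` are hypothetical.

```python
import os
import tempfile
from collections import defaultdict


def folder_sizes(root):
    """Map each folder under root to the total bytes of all files beneath it."""
    root = os.path.abspath(root)
    sizes = defaultdict(int)
    for dirpath, _dirnames, filenames in os.walk(root):
        local = sum(os.path.getsize(os.path.join(dirpath, name)) for name in filenames)
        # Credit the bytes to this folder and to every ancestor up to root.
        current = dirpath
        while True:
            sizes[current] += local
            if current == root:
                break
            current = os.path.dirname(current)
    return dict(sizes)


def top_consumers(root, n=5):
    """Return the n largest folders as (path, bytes) pairs, largest first."""
    return sorted(folder_sizes(root).items(), key=lambda kv: kv[1], reverse=True)[:n]


# Demo on a throwaway local tree: logs/b.bin (300 bytes), logs/2016/a.bin (500 bytes).
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "logs", "2016"))
    with open(os.path.join(tmp, "logs", "2016", "a.bin"), "wb") as f:
        f.write(b"x" * 500)
    with open(os.path.join(tmp, "logs", "b.bin"), "wb") as f:
        f.write(b"x" * 300)
    demo = {os.path.relpath(path, tmp): size for path, size in folder_sizes(tmp).items()}
```

Each folder's total includes everything below it, so the root always reports the full size and the largest subtrees surface first in `top_consumers`.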

    9 votes
    under review  ·  1 comment
  16. Missing out-of-the-box alerting and notification of failures

    We need this out of the box before we can take a critical dependency on the service.

    9 votes
    0 comments

    Hello! Can you provide more details on what kinds of failures you would like to be alerted or notified about, and how you would like to be alerted or notified? For example, do you want to be sent an email every time an ADLA job fails?

  17. Support more wildcards in ADLA file sets

    It would be nice to have more wildcards besides the asterisk in file sets. Suppose we have two sets of files, e.g.

    file01.tbl
    file02.tbl

    and

    file0101.tbl
    file0102.tbl
    file0201.tbl
    file0202.tbl

    It's impossible to select just one of the two sets, since the syntax

    @set1 = EXTRACT ..... FROM "/file{*}.tbl" USING .....;

    matches all the files. The proposal is to allow another wildcard, e.g. ?, to mean a single character, so we could write e.g.

    @set1 = EXTRACT ..... FROM "/file{??}.tbl" USING .....;
    @set2 = EXTRACT ..... FROM "/file{????}.tbl" USING .....;

    Of course the actual syntax/wildcard does not have…
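
To make the proposed semantics concrete, here is a sketch in Python rather than U-SQL. It uses the standard library `fnmatch` module, where `?` already means exactly one character, as a stand-in for the requested behavior. The `select` helper is hypothetical, and nothing here reflects how U-SQL file sets are actually implemented.

```python
from fnmatch import fnmatchcase

# The two groups of files from the request, mixed together in one folder listing.
files = [
    "file01.tbl",
    "file02.tbl",
    "file0101.tbl",
    "file0102.tbl",
    "file0201.tbl",
    "file0202.tbl",
]


def select(pattern, names):
    """Return the names matching a shell-style pattern ('?' = one character, '*' = any run)."""
    return [name for name in names if fnmatchcase(name, pattern)]


everything = select("file*.tbl", files)   # the asterisk cannot tell the groups apart
set1 = select("file??.tbl", files)        # exactly two characters after "file"
set2 = select("file????.tbl", files)      # exactly four characters after "file"
```

With a single-character wildcard the two groups separate cleanly, which is the distinction the asterisk alone cannot express.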

    9 votes
    3 comments
  18. 9 votes
    under review  ·  2 comments
  19. Provide IFormatter<T> for User Defined Types without Attribute

    Currently, in order to use a user defined type as a column type, it must be attributed with SqlUserDefinedTypeAttribute to provide an IFormatter.

    However, this makes it highly inconvenient to use types that are defined in external libraries or projects that don't have a U-SQL dependency.

    Please provide an alternative mechanism for supplying a Type to IFormatter mapping.

    8 votes
    under review  ·  1 comment
  20. Support information about number of files and records in SDK

    With the ADLA SDK we can get job information such as job ID, name, etc. However, there is no information about the number of files processed in the job or the number of records. This information is currently available from the "system" folder in ADLS (as well as the web portal) and would be helpful for job telemetry as well as historical analysis. Kindly add support for detailed job telemetry to the SDK.

    8 votes
    0 comments