Data Lake

You can use this site to communicate with the Azure Data Lake team. We are eager to hear your ideas, suggestions, or any other feedback that would help us improve the service to better fit your needs.

If you have technical questions, please visit our forums.
If you are looking for tutorials and documentation, please visit http://aka.ms/AzureDataLake.

How can we improve Microsoft Azure Data Lake?

Enter your idea and we'll search to see if someone has already suggested it.

If a similar idea already exists, you can support and comment on it.

If it doesn't exist, you can post your idea so others can support it.

  1. Get total DL Store space utilized via PowerShell

    Currently there is no way to get the total storage utilized in a DLS account via PowerShell, although it does appear in the account's overview tab in Azure.

    1 vote
    0 comments
    • Move/Change Visual Studio Data Lake Tools Job View Cancel Button

      In Visual Studio, in the Cloud Explorer job view for Azure Data Lake Tools, can you please move or change the behaviour of the red cross cancel button? See attached.

      Currently the button resides right next to the job refresh button and does not offer a confirmation prompt.

      For long-running jobs I often want to manually refresh the job graph more frequently, but I have accidentally clicked the cancel button! This ends up costing a lot of time and money in compute because of an imprecise click! Totally my fault, but we are all human!

      Could the "dangerous" cancel button…

      1 vote
      0 comments
      • Support querying ADLA catalogs across Azure regions

        The March 2017 ADL release introduced the ability to query ADLA catalogs across ADLA accounts 'within' the same Azure region. We'd like to see support for this 'across' Azure regions e.g. Querying an ADLA catalog in the East US region from an ADLA account in the North Europe region.

        2 votes
        1 comment
        • Data Catalog using REST Api

          Please provide documentation on how to register, publish, annotate, and create a catalog for documents that are already in Azure Blob storage, and for data sets that reside in an external location, using Java REST API services.
          Please share any URLs or documentation specifically for Java-based Azure Data Catalog operations.

          Thanks

          1 vote
          0 comments
          • Please make Reduced On Column information available to the reducer.

            I have written several reducers for the purpose of validating script changes. I would like to ensure that the reduced-on columns are included in the output schema as the first columns, and that during reduction the reduced-on column values are copied to the output row. Because the reducer cannot tell which columns were reduced on, either from the reducer itself or from the column data, I must pass them in to the reducer as arguments. This complicates the implementation, as I now have to keep these values in sync.

            Example: REDUCE ON a, b USING MyReducer("a","b").

            I'd like to call a property in the…

            1 vote
            2 comments
            • Attach Data Lake as a mounted drive with Windows authentication

              I want to perform file operations on the files in Data Lake:

              1. Zip and unzip files.
              2. Calculate checksums.
              3. Read, write, and overwrite data.

              2 votes
              0 comments
              • Allow multiple assembly versions in ADL Analytics Database\Assemblies

                When using third-party DLLs, there are dependencies on widely used DLLs such as Newtonsoft.JSON. As a result, different versions of these assemblies get deployed and a conflict occurs. It would be good to have side-by-side execution of assembly versions.

                1 vote
                0 comments
                • Release ADL to Canadian region

                  HDInsight has recently (Feb 2017) been made generally available in Canada. It would be nice to have ADL made available really soon, at least in preview. Are there any plans you can share at this time?

                  14 votes
                  8 comments
                  • Support regular expressions in virtual columns or make the file path accessible in IExtractor

                    Currently only basic filtering is supported, so it is hard to filter input using complex logic.

                    1 vote
                    0 comments
                    • Extract datetime using built in extractors with a timezone other than UTC-7

                      The built-in extractors currently always set the timezone to UTC-7 when extracting DateTime columns. It would be very helpful to have a way of extracting DateTime columns without setting the timezone to UTC-7. The current behaviour is not what users in my organisation anticipate, and it causes unexpected results when comparing date times.

                      I am aware that it is possible to work around this issue, by extracting a string and performing the conversion afterwards.

                      https://msdn.microsoft.com/en-us/library/azure/mt621366.aspx
                      "U-SQL’s built-in extractors normalize all date time values to UTC -07:00 and then drop timezone information if present."

                      3 votes
                      0 comments
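To make the behaviour quoted in the UTC-7 request above concrete, here is a small Python model of that one sentence from the documentation. It is only an illustration of the described rule, not the extractor's actual implementation. The function names and the sample timestamp are made up for this sketch, and it assumes the input is an ISO 8601 string that carries an explicit offset.

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC-07:00 offset named in the quoted documentation.
UTC_MINUS_7 = timezone(timedelta(hours=-7))


def normalize_like_builtin_extractor(text: str) -> datetime:
    """Model of the quoted rule: shift to UTC-07:00, then drop the offset.

    Hypothetical helper; assumes `text` has an explicit offset such as '+00:00'.
    """
    parsed = datetime.fromisoformat(text)
    if parsed.tzinfo is None:
        raise ValueError("this sketch only models inputs that carry an offset")
    return parsed.astimezone(UTC_MINUS_7).replace(tzinfo=None)


def parse_keeping_utc(text: str) -> datetime:
    """Shape of the workaround: read the value as a string, convert it yourself."""
    return datetime.fromisoformat(text).astimezone(timezone.utc)


sample = "2017-03-01T12:00:00+00:00"
print(normalize_like_builtin_extractor(sample))  # 2017-03-01 05:00:00
print(parse_keeping_utc(sample))                 # 2017-03-01 12:00:00+00:00
```

The first result has silently moved seven hours and lost its offset, which is the kind of surprise the request describes when such values are later compared with UTC timestamps.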
                      • Parameter model for U-SQL scripts

                        Create a parameter model for U-SQL scripts so that values can be passed in.

                        3 votes
                        1 comment
                        • Support OVER for custom aggregates

                          With built-in aggregates, we can use the OVER keyword. But there is currently no such capability for custom aggregates.

                          2 votes
                          0 comments
                          • Allow a READONLY EXCLUDE clause

                            Listing the read-only columns for a UDO is fine when there is a known, small set of columns, but is really annoying when more of the columns are read-only than not. It would be nice if we could specify something like `READONLY EXCLUDE col2, col3`, where I specify only the columns that I know are going to be updated.

                            If a column is added to the source table, the developer has to remember to update all UDOs or lose ideal optimization. With a READONLY EXCLUDE clause, the code can run fully optimized regardless of whether surrounding columns are added…

                            1 vote
                            0 comments
                            • Ability to drop partitions (partitioned by multiple columns) by providing a partial partition key

                              For a table that is partitioned on two columns, e.g., date and region, we want to drop all partitions for a given date or all partitions for a given region.

                              Currently we have to enumerate the “missing” column, since U-SQL expects us to provide a full partition spec on DROP.

                              For example, for the hourly tables, we use two columns as the partition key: DimDate + DimHour. U-SQL doesn’t support dropping partitions by a single partition column.

                              ALTER TABLE [ADLTelemetryDB].[dbo].[AllJobsHourlyTbl] DROP IF EXISTS PARTITION ("2017-01-25");

                              will return an error “The number of columns in a partition clause must…

                              1 vote
                              0 comments
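Until partial partition keys are supported, the enumeration that the request above describes can at least be scripted. The following Python sketch generates one full-spec DROP statement per hour for a given date. It rests on assumptions that are not confirmed by the post: that DimHour is an integer from 0 to 23, that it is the second partition column, and that the date can be written as a quoted string as in the post's own statement. The helper name `drop_partition_statements` is hypothetical.

```python
from typing import Iterable, List


def drop_partition_statements(
    table: str, date: str, hours: Iterable[int] = range(24)
) -> List[str]:
    """Build one ALTER TABLE ... DROP statement per (date, hour) pair.

    Assumption: the table is partitioned on (DimDate, DimHour) and DimHour
    is an integer; adjust the literal formats to the real column types.
    """
    return [
        f'ALTER TABLE {table} DROP IF EXISTS PARTITION ("{date}", {hour});'
        for hour in hours
    ]


statements = drop_partition_statements(
    "[ADLTelemetryDB].[dbo].[AllJobsHourlyTbl]", "2017-01-25"
)
print(len(statements))  # 24
print(statements[0])
```

The generated statements would then be pasted into, or submitted as, a U-SQL script. This only automates the workaround; it does not remove the need for U-SQL to accept a partial key.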
                              • 8 votes
                                0 comments
                                • Improve performance of writing large files to blob

                                  Writing a 100 GB file to blob is painfully slow, and the work is always put on only one vertex. Is there a way to parallelize the work across multiple vertices?

                                  3 votes
                                  1 comment

                                    Unfortunately, Windows Azure Blob store does not provide an efficient way to stitch intermediate blobs together without rereading them. We therefore recommend that you instead write the output to the blob store with a wildcard in the file name, so that the files are not stitched together. For example:
                                    OUTPUT @result
                                    TO "/path/filefolder/file_{*}.csv"
                                    USING Outputters.Csv();

                                  • Filesets should be able to read combination of GZip and non-gzip files

                                    Currently, I get an error when I use a wildcard fileset pattern to read files that are a combination of Gzip and non-Gzip files. It would be nice if this were handled internally. Our workaround for now is to create two different extract statements (one for {*}.tsv and one for {*}.tsv.gz).

                                    1 vote
                                    0 comments
                                    • Add IF FILE.MATCH_EXISTS to tell us if a fileset has at least one match

                                      Currently, IF FILE.EXISTS only supports a fully resolved file name. We want to know if our file set pattern found any matches, and then only run a script if it did.

                                      1 vote
                                      0 comments
                                      • JSON support for data analytics in the Azure portal

                                        JSON data hosted in Data Lake Store needs to be processed by data analytics jobs directly and easily. Lots of customers are waiting for that. There is no clear guidance on how to query JSON data using U-SQL, and we might be losing lots of business, as most customers nowadays have complex JSON data to be queried.

                                        5 votes
                                        0 comments
                                        • 1 vote
                                          0 comments