Data Factory

Azure Data Factory allows you to manage the production of trusted information by offering an easy way to create, orchestrate, and monitor data pipelines over the Hadoop ecosystem using structured, semi-structured, and unstructured data sources. You can connect to your on-premises SQL Server, Azure databases, tables, or blobs and create data pipelines that process the data with Hive and Pig scripting, or custom C# processing. The service offers a holistic monitoring and management experience over these pipelines, including a view of their data production and data lineage down to the source systems. The outcome of Data Factory is the transformation of raw data assets into trusted information that can be shared broadly with BI and analytics tools.
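To make the pipeline concept concrete, here is a minimal sketch of an ADF (v1-style) pipeline definition with a single Copy activity moving a daily slice from blob storage to Azure SQL. All names (pipeline, activities, datasets) are hypothetical, not taken from this page:

```json
{
    "name": "CopyBlobToSqlPipeline",
    "properties": {
        "description": "Hypothetical example: copy a daily slice from blob storage to Azure SQL.",
        "activities": [
            {
                "name": "CopyFromBlobToSql",
                "type": "Copy",
                "inputs": [ { "name": "InputBlobDataset" } ],
                "outputs": [ { "name": "OutputSqlDataset" } ],
                "typeProperties": {
                    "source": { "type": "BlobSource" },
                    "sink": { "type": "SqlSink" }
                },
                "scheduler": { "frequency": "Day", "interval": 1 }
            }
        ],
        "start": "2017-01-01T00:00:00Z",
        "end": "2017-02-01T00:00:00Z"
    }
}
```

The referenced input and output datasets would be defined separately, each pointing at a linked service for the storage account or database.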

Do you have an idea, suggestion or feedback based on your experience with Azure Data Factory? We’d love to hear your thoughts.

How can we improve Microsoft Azure Data Factory?

There are two ways to get more votes:

  • When an admin closes an idea you've voted on, you'll get your votes back from that idea.
  • You can remove your votes from an open idea you support.

To see ideas you have already voted on, select the "My feedback" filter and select "My open ideas".

Enter your idea and we'll search to see if someone has already suggested it.

If a similar idea already exists, you can support and comment on it.

If it doesn't exist, you can post your idea so others can support it.


• Support SFTP as sink

  Support pushing data to an SFTP server in the Copy activity.

  0 votes · 0 comments

• Copy Wizard should use Polybase to export from SQL DW to blob

  Currently it appears that the ADF Copy Wizard does not use Polybase in SQL DW (CREATE EXTERNAL TABLE AS SELECT...) to export the contents of a table into blob storage. Since this would be much faster, please support it. Also, if you're using the Copy Wizard to copy from DW to DW, please use Polybase for both the export and the import.

  The same should apply to SQL DW to Azure Data Lake Store, as Polybase now supports that.

  6 votes · 1 comment

• Capture file name as a variable

  ADF should provide the ability to capture the input file name and other file-related parameters as a variable and pass it as input to other activities, such as the Stored Procedure activity or a custom .NET activity. Currently, on successful completion of a pipeline, I would like to store the filename in my on-premises (or Azure) SQL DB. Today it is not possible to pass the filename as an input parameter to the Stored Procedure activity.

  5 votes · 0 comments

• Copy Activity should coerce data type values using type information in data sink structure

  Hi,

  I propose that the Copy activity support type coercion between source and sink data types.

  For example, if the source column is a String field containing "True", and the sink column is a Boolean column, then the value should be written to the sink as a true boolean and not a four-character string.

  (Or whatever alternate mapping you like; this is how Convert.ToBoolean(String) works in C#.)

  Regards,
  Ben
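  For context, a dataset definition can already declare column types in its structure section; the proposal is for the Copy activity to honor that type information when writing to the sink. A hypothetical sink structure (column names are illustrative) might look like:

```json
"structure": [
    { "name": "UserName", "type": "String" },
    { "name": "IsActive", "type": "Boolean" }
]
```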

  5 votes · 0 comments

• Dataset policy only applies to blob - make it also available for Azure Data Lake Store

  Can the dataset policy (see below) in Azure Data Factory be applied to Azure Data Lake Store? It currently only applies to blob.

  "policy": {
      "validation": {
          "minimumSizeMB": 10.0
      }
  }
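  For reference, this is roughly where the policy block sits today in a blob dataset definition; the request is for the same validation to work when the dataset type is AzureDataLakeStore. The names below are hypothetical:

```json
{
    "name": "InputBlobDataset",
    "properties": {
        "type": "AzureBlob",
        "linkedServiceName": "StorageLinkedService",
        "typeProperties": {
            "folderPath": "mycontainer/input/",
            "format": { "type": "TextFormat" }
        },
        "availability": { "frequency": "Day", "interval": 1 },
        "policy": {
            "validation": { "minimumSizeMB": 10.0 }
        }
    }
}
```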

  1 vote · 0 comments

• Configure copy activity to accept missing blob

  In some pipelines, I have a Copy activity that copies from a blob container to a database table, where it is not an error if the blob container is empty. If it is empty, I would like the pipeline to stop without an error. Whether this is treated as an error could be configurable.

  4 votes · 0 comments

• Support the Partition By Clause for Azure Table Datasets

  Allow datasets with a type of 'AzureTable' to accept the partition-by clause, so that slice start and end date values can be injected dynamically into the table name attribute:

  "tableName": "Table_{SliceStartYear}"

  See example attached.
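  For comparison, blob datasets already support slice-based substitution through partitionedBy variables in folderPath; the request is for the same mechanism on an AzureTable dataset's tableName. A hypothetical blob example (container and folder names are illustrative):

```json
"typeProperties": {
    "folderPath": "mycontainer/logs/{Year}/{Month}",
    "partitionedBy": [
        { "name": "Year", "value": { "type": "DateTime", "date": "SliceStart", "format": "yyyy" } },
        { "name": "Month", "value": { "type": "DateTime", "date": "SliceStart", "format": "MM" } }
    ]
}
```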

  3 votes · 0 comments

• Encrypted zip file support

  It would be very helpful to have AES-256 encrypted zip file support to simplify pipelines, rather than needing Azure Batch or Functions.

  17 votes · 1 comment

• In-Memory Dataset

  When chaining multiple activities together, it is unhelpful to be required to serialize and de-serialize into storage or SQL at every step, and the performance penalty is huge.

  I propose an In-Memory Dataset type that can be used to connect activities directly together without going via intermediate persistent storage.

  I appreciate that this is only going to be feasible where the sequence of activity steps is running within the same execution environment; it would need some kind of configuration validation.

  From the viewpoint of custom activities, this proposal would also be dependent on the following being implemented:

  https://feedback.azure.com/forums/270578-data-factory/suggestions/18519259-improved-customisation-framework-1-decouple-ap

  4 votes · 0 comments

• Improved Customisation Framework (1) - Decouple API for Custom Activity from Dataset

  The ability to create custom .NET activities is great, but I feel that the current approach limits the full potential for an ecosystem to grow around Data Factory within (and across) enterprises.

  Currently a custom activity must include code to directly access the source and sink datasets, for example by exercising the Azure Storage API, despite the existence of already-working out-of-the-box integrations to these APIs used in non-custom activities. If I were building a custom activity, say to apply business rules from an external engine, I would need to make a different custom activity implementation for each potential combination…

  2 votes · 1 comment

• Add DocumentDb update support

  Currently, only insert operations are supported by the DocumentDb connector via the Copy activity. It should allow updating documents when an id is specified and the document already exists.

  When we try to do so now, we receive an error message like the one attached.

  5 votes · 1 comment

• Integrate with Logic Apps

  Similar in concept to other requests here for API-oriented connections, there is a general need to identify/build time-sliced data in batches, but then individually process each item in the batch.

  I propose that a first-class integration of Logic Apps as a sink be provided, such that a Logic App instance can be started for each item in a slice, i.e. each line of data would be assembled into JSON and passed as body attributes to a Logic App HTTP trigger endpoint.

  5 votes · 2 comments

• HBase sink for Azure Data Factory

  Azure Data Factory would be a nice tool to precompute (materialize) views from raw data and store the results in HBase for fast access, but it is missing an HBase sink.

  3 votes · 0 comments

• Add scriptPath option to externalise oracleReaderQuery to a .sql file on blob storage

  Add a scriptPath option to externalise oracleReaderQuery to a .sql file on blob storage for on-premises Oracle or SQL Server sources, similar to the option already available for HDInsight Hive scripts.

  6 votes · 0 comments

• Reset a Data Factory (dev process)

  Add a button that deletes all resources in a Data Factory. When I am testing in Visual Studio with the Publish feature, I need to delete them manually in the portal first so I don't get scheduling errors, then republish everything. An option in both VS and the portal would be really appreciated, thanks!

  1 vote · 0 comments

• Delete activity logs

  Allow deleting past activity logs in Resource Explorer from the Activity Window. When developing and testing with many activities/pipelines, searching becomes difficult even with the filtering options.

  1 vote · 0 comments

• Conditional execution of activities

  There are usually two paths in an ETL workflow: business-logic activities and error activities. If a business-logic activity fails, the workflow should branch to an error activity, which could implement functionality to enable easier reruns. Conditional branching based on parameters would also make ETL workflows easily extensible.

  7 votes · 0 comments

• Add a new email activity with the ability to send attachments as part of the workflow

  There are numerous instances when an output (statistics) or error file has to be mailed to administrators. Email as an activity would help in implementing this functionality.

  3 votes · 0 comments

• Scrape a website like Wikipedia

  1 vote · 0 comments

• Rename objects in the portal

  Provide the ability to rename all objects and update their associated scripts. Right now, deleting a dataset removes its slice history, which can get very problematic.

  The ability to update a dataset's name and availability without having to recreate it would be very useful.

  2 votes · 0 comments

• Don't see your idea?