Data Factory

Azure Data Factory allows you to manage the production of trusted information by offering an easy way to create, orchestrate, and monitor data pipelines over the Hadoop ecosystem using structured, semi-structured, and unstructured data sources. You can connect to your on-premises SQL Server, Azure database, tables, or blobs and create data pipelines that process the data with Hive and Pig scripting, or custom C# processing. The service offers a holistic monitoring and management experience over these pipelines, including a view of their data production and data lineage down to the source systems. The outcome of Data Factory is the transformation of raw data assets into trusted information that can be shared broadly with BI and analytics tools.
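
For context, a classic (v1) Data Factory pipeline is authored as a JSON document that wires datasets and activities together. Below is a minimal sketch of a copy pipeline; all dataset and linked-service names are hypothetical.

    {
        "name": "CopyOnPremSalesToBlob",
        "properties": {
            "activities": [
                {
                    "name": "CopySqlToBlob",
                    "type": "Copy",
                    "inputs": [ { "name": "OnPremSqlSalesDataset" } ],
                    "outputs": [ { "name": "BlobSalesDataset" } ],
                    "typeProperties": {
                        "source": { "type": "SqlSource", "sqlReaderQuery": "SELECT * FROM Sales" },
                        "sink": { "type": "BlobSink" }
                    },
                    "policy": { "timeout": "01:00:00", "retry": 2 }
                }
            ],
            "start": "2017-01-01T00:00:00Z",
            "end": "2017-01-07T00:00:00Z"
        }
    }

The same pattern extends to Hive, Pig, and custom .NET activities; only the activity type and its typeProperties change.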

Do you have an idea, suggestion or feedback based on your experience with Azure Data Factory? We’d love to hear your thoughts.

How can we improve Microsoft Azure Data Factory?

• Rerun with upstream needs to respect dependencies

  Using the Monitor window to select "Rerun with upstream in pipeline" seems like a great feature, except that dependency ordering is not respected and the pipeline runs out of order.

  The use case I have is that I can see my output is incorrect, and I've fixed a problem in the pipeline code. Now I want to rerun that slice through the corrected pipeline to regenerate the output data.

  What happens in practice is that the activities seem to run in reverse order, so that each activity runs before its own inputs are regenerated (a sketch of such a dependency chain follows below).

  I want an easy way to completely…

  1 vote · 1 comment

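  To make "upstream" concrete, here is an abridged sketch of two chained activities where the output dataset of the first is the input of the second; all names are illustrative. A correct "rerun with upstream" of FinalDataset would regenerate StagedDataset first.

      "activities": [
          {
              "name": "StageRawData",
              "type": "HDInsightHive",
              "inputs": [ { "name": "RawDataset" } ],
              "outputs": [ { "name": "StagedDataset" } ],
              "linkedServiceName": "MyHDInsightCluster",
              "typeProperties": { "scriptPath": "scripts/stage.hql", "scriptLinkedService": "MyBlobStorage" }
          },
          {
              "name": "AggregateStagedData",
              "type": "HDInsightHive",
              "inputs": [ { "name": "StagedDataset" } ],
              "outputs": [ { "name": "FinalDataset" } ],
              "linkedServiceName": "MyHDInsightCluster",
              "typeProperties": { "scriptPath": "scripts/aggregate.hql", "scriptLinkedService": "MyBlobStorage" }
          }
      ]
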
• Logging support for Stored Procedure Pipeline

  I do not see an option for logging output from a stored procedure pipeline. It would be nice if the output of the stored procedure were logged in the pipeline (a sketch of the current activity shape follows below).

  3 votes · 0 comments

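  For reference, a stored procedure activity in a classic Data Factory pipeline looks roughly like this abridged sketch (names are hypothetical); nothing in the definition surfaces the procedure's output for logging, which is the gap this idea describes.

      {
          "name": "RunLoadSalesProc",
          "type": "SqlServerStoredProcedure",
          "outputs": [ { "name": "DummyOutputDataset" } ],
          "typeProperties": {
              "storedProcedureName": "usp_LoadSales",
              "storedProcedureParameters": {
                  "SliceStart": "$$Text.Format('{0:yyyy-MM-dd}', SliceStart)"
              }
          }
      }
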
• Identify IP address of Data Factory

  It is not currently possible to identify the IP address of the Data Factory, which you need for firewall rules, including the Azure SQL Server firewall…

  6 votes · 0 comments

• Add serverless compute for Data Factory custom tasks

  Adding Azure Batch and HDInsight for custom activities can be expensive to manage and create, especially for small tasks that are not available out of the box in ADF. Supporting Azure Functions to run custom tasks would help massively here.

  4 votes · 0 comments

• Activity monitoring seconds

  Show seconds in the monitoring pane for activity start and end dates.

  The grid is already quite full, so probably just in the explorer pane.

  This would help to verify that chained activities are running in the correct sequence.

  3 votes · 0 comments

• Docker

  Please consider creating a Docker image/installation of the Microsoft Data Management Gateway to allow rapid deployment that is isolated from the underlying operating system, which would avoid issues with x86 driver installation for Oracle and other database sources.

  3 votes · 0 comments

• Sample custom pipeline

  In Data Factory it would be great if we had one or two sample "custom pipelines": perhaps one as a template where a custom script is called for processing, and another where the custom script inserts data from a file into a table (a sketch of such a step follows below).

  3 votes · 0 comments

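  As a rough illustration of what such a template might contain, a custom (.NET) activity step in a classic Data Factory pipeline is defined along these lines; every name here is a placeholder.

      {
          "name": "RunCustomScript",
          "type": "DotNetActivity",
          "inputs": [ { "name": "InputFileDataset" } ],
          "outputs": [ { "name": "OutputTableDataset" } ],
          "linkedServiceName": "MyAzureBatchLinkedService",
          "typeProperties": {
              "assemblyName": "MyCustomActivity.dll",
              "entryPoint": "MyCustomActivity.FileToTableActivity",
              "packageLinkedService": "MyBlobStorage",
              "packageFile": "packages/MyCustomActivity.zip"
          }
      }

  A ready-made pair of templates like this, one generic and one file-to-table, would give non-developers a working starting point.
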
• Impala data source

  Can you please provide a copy source from Impala (Cloudera) to the data warehouse?

  1 vote · 0 comments

• Support Azure Data Lake Store as a 'scriptLinkedService' for a U-SQL activity

  If we run a U-SQL (ADLA) activity using Data Factory and wish to reference a scriptPath which contains the full U-SQL script, the only scriptLinkedService that's supported for the scriptPath is Azure blob storage. It would be very nice to have *both* the source data and the supporting scripts all in Azure Data Lake Store. Currently we need a blob storage container just for the scripts to support the Data Lake operations (a sketch of the current shape follows below).

  16 votes · 0 comments

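  For readers unfamiliar with the limitation, here is an abridged sketch of a U-SQL activity as it must be written today (names are illustrative):

      {
          "name": "RunUSqlTransform",
          "type": "DataLakeAnalyticsU-SQL",
          "linkedServiceName": "MyDataLakeAnalytics",
          "typeProperties": {
              "scriptPath": "scripts/transform.usql",
              "scriptLinkedService": "MyBlobStorage",
              "degreeOfParallelism": 3
          }
      }

  The request is for scriptLinkedService to also accept an Azure Data Lake Store linked service, so the scripts and the data can live together.
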
• Graceful custom activity timeout in Data Factory

  Ideally I'd like to use the timeout within the Data Factory pipeline to solely manage the overall timeout of a custom activity, leaving the Data Factory monitoring pane to be the source of truth (the relevant policy block is sketched below).

  However, if the timeout occurs while I am mid-copy to Data Lake Store (for example), I would want the opportunity to clean up (I can't find examples of transaction handling).

  I've contemplated having a timeout within the custom activity as well, but it doesn't feel clean having multiple layers of timeouts. And having the hard timeout on the pipeline feels right in case the…

  4 votes · 0 comments

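  Today the closest control is the activity policy, a hard timeout with no cleanup hook; a minimal sketch:

      "policy": {
          "timeout": "02:00:00",
          "concurrency": 1,
          "retry": 2
      }

  The idea is a graceful variant of this timeout that gives the custom activity a chance to clean up before it is terminated.
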
• Custom activity progress reporting

  Data Factory seems quite blind to the progress of long-running custom activities. That isn't too bad for production, but for development it's nice to have more feedback.

  It would be nice to be able to report progress back to Data Factory from within a custom activity.

  Be it a progress indicator or perhaps more regular log file dumps.

  2 votes · 0 comments

• Export status or metrics from the data management gateway

  Currently, the data management gateway in Azure Data Factory cannot be monitored properly from Azure. I suggest adding metrics to monitor the status of the gateway in ADF.

  1 vote · 0 comments

• Internationalised monitoring pane

  The monitoring pane is very useful, but it currently uses the MM/DD/YYYY date format, which means that in the UK you need to take a second glance, slowing you down when using it.

  It would be useful to have the slice times shown explicitly in UTC.

  It should be using the regional settings in the main portal.

  6 votes · 0 comments

• ADF Move NOT Copy

  Can you please enable Data Factory to move (i.e. copy, verify, and delete) files/folders? I know there is a .NET custom activity to do this, but it is very difficult for those of us trying to use Azure who are technically inclined but not exactly developers. Data Factory itself is easy to figure out if you are not a developer; however, trying to learn .NET in a short period of time is not an easy task.

  Also, forgive me if I'm incorrect, but I cannot develop a custom activity to delete files from an on-premises file system. Currently, there…

  6 votes · 0 comments

• Setting Expiration in Azure Data Lake Store

  Ability to set expiration of files at the Data Lake Store when the file is created during a copy pipeline.

  5 votes · 0 comments

• Azure Table data sink does not allow building re-runnable data slices in a simple way

  I attempted unsuccessfully to build a pipeline copying monthly sales data from an (on-premises) database to an Azure Table.

  The idea / hope was:
  - use the data slice start date as the partition key
  - delete all previously loaded data with that partition key when re-running a data slice

  Current functionality only provides merge or replace behaviour, but does not allow for purge and reload as we would do with a database table, leading to a high risk of keeping orphan records from old loads (the sink options are sketched below).

  It is feasible to populate the partition key field with the slice start date using a…

  2 votes · 0 comments

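  For illustration, the Azure Table sink currently exposes roughly these options (a sketch; whether the slice macro is accepted in the default partition key value is an assumption):

      "sink": {
          "type": "AzureTableSink",
          "azureTableDefaultPartitionKeyValue": "$$Text.Format('{0:yyyyMM}', SliceStart)",
          "azureTableInsertType": "replace",
          "writeBatchSize": 100
      }

  Even with "replace", existing entities are only overwritten key by key; there is no purge-and-reload of a whole partition, which is the gap described above.
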
• Dataset wizard

  Instead of a copy wizard, a dataset wizard would be helpful. Generating a dataset that has 150 fields is a huge pain (see the sketch below). Currently, the Copy Wizard provides too many functions. It would be nice if these were broken out, instead of providing an end-to-end solution.

  4 votes · 0 comments

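  The pain point is the dataset structure array, which has to be hand-authored one field at a time; for a 150-column table that means 150 entries like these (an illustrative sketch, truncated after three fields):

      {
          "name": "SalesDataset",
          "properties": {
              "type": "AzureSqlTable",
              "linkedServiceName": "MyAzureSqlLinkedService",
              "structure": [
                  { "name": "OrderId", "type": "Int64" },
                  { "name": "OrderDate", "type": "Datetime" },
                  { "name": "Amount", "type": "Decimal" }
              ],
              "typeProperties": { "tableName": "Sales" },
              "availability": { "frequency": "Day", "interval": 1 }
          }
      }
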
• HTTP source to request Dynamics 365

  I am able to get Dynamics data through OData; however, it gives me an Object data type in the response.

  I am not able to use an HTTP request to get Dynamics 365 data. It fails with an authentication error even after providing correct credentials with the basic authentication method.

  1 vote · 0 comments

• HDInsight with Azure Data Lake

  Today you can't use an on-demand or bring-your-own HDInsight cluster with Data Factory, as the cluster requires a blob storage linked service (sketched below). We need the ability to use HDInsight clusters backed by Azure Data Lake in a Data Factory pipeline.

  13 votes · 0 comments

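  Concretely, the on-demand HDInsight linked service currently requires an Azure Storage (blob) linked service for its working storage, as in this abridged sketch (names are illustrative):

      {
          "name": "OnDemandHDInsightCluster",
          "properties": {
              "type": "HDInsightOnDemand",
              "typeProperties": {
                  "clusterSize": 4,
                  "timeToLive": "00:30:00",
                  "linkedServiceName": "MyBlobStorage"
              }
          }
      }

  The request is to allow an Azure Data Lake Store linked service here instead.
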
• Output dataset to have support for additional system columns using SliceStart or SliceEnd

  Currently a dataset can have columns which are defined at the source or at the target. A column cannot be something coming from Azure Data Factory itself.

  If Azure Data Factory supported specifying additional system columns as part of the dataset, it would help handle delta scenarios, e.g.:

      {
          "name": "DeltaDateTime",
          "type": "SliceEnd"
      }

  11 votes · 0 comments

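  For comparison, slice variables are already usable for folder partitioning via partitionedBy, just not as data columns; a sketch of what works today:

      "typeProperties": {
          "folderPath": "sales/{SliceYMD}",
          "partitionedBy": [
              {
                  "name": "SliceYMD",
                  "value": { "type": "DateTime", "date": "SliceStart", "format": "yyyyMMdd" }
              }
          ]
      }

  Extending the same macros into the structure section, as proposed above, would make delta columns declarative.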