Data Factory

Azure Data Factory allows you to manage the production of trusted information by offering an easy way to create, orchestrate, and monitor data pipelines over the Hadoop ecosystem using structured, semi-structured, and unstructured data sources. You can connect to your on-premises SQL Server, Azure databases, tables, or blobs and create data pipelines that process the data with Hive and Pig scripting, or with custom C# processing. The service offers a holistic monitoring and management experience over these pipelines, including a view of their data production and data lineage down to the source systems. The outcome of Data Factory is the transformation of raw data assets into trusted information that can be shared broadly with BI and analytics tools.
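
As a concrete example, the sketch below registers a minimal one-activity copy pipeline from Python. It assumes the classic (v1) Data Factory ARM resource path and api-version 2015-10-01; the subscription, resource group, factory and dataset names, and the bearer token are placeholders.

    import json
    import requests

    SUBSCRIPTION = "<subscription-id>"      # placeholder
    RESOURCE_GROUP = "<resource-group>"     # placeholder
    FACTORY = "<data-factory-name>"         # placeholder
    TOKEN = "<aad-bearer-token>"            # acquire via Azure AD

    # One copy activity moving a daily slice from blob storage to Azure SQL.
    pipeline = {
        "name": "CopyBlobToSqlPipeline",
        "properties": {
            "activities": [{
                "name": "BlobToSqlCopy",
                "type": "Copy",
                "inputs": [{"name": "InputBlobDataset"}],
                "outputs": [{"name": "OutputSqlDataset"}],
                "typeProperties": {"source": {"type": "BlobSource"},
                                   "sink": {"type": "SqlSink"}},
            }],
            "start": "2017-01-01T00:00:00Z",
            "end": "2017-01-02T00:00:00Z",
        },
    }

    url = ("https://management.azure.com/subscriptions/%s/resourceGroups/%s"
           "/providers/Microsoft.DataFactory/datafactories/%s/datapipelines/%s"
           "?api-version=2015-10-01"
           % (SUBSCRIPTION, RESOURCE_GROUP, FACTORY, pipeline["name"]))
    resp = requests.put(url, data=json.dumps(pipeline),
                        headers={"Authorization": "Bearer " + TOKEN,
                                 "Content-Type": "application/json"})
    resp.raise_for_status()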

Do you have an idea, suggestion or feedback based on your experience with Azure Data Factory? We’d love to hear your thoughts.

How can we improve Microsoft Azure Data Factory?

Enter your idea and we'll search to see if someone has already suggested it.

If a similar idea already exists, you can support and comment on it.

If it doesn't exist, you can post your idea so others can support it.

  • HBase sink for Azure Data Factory

    Azure Data Factory would be a nice tool to precompute (materialize) views from raw data and store the results in HBase for fast access, but it is missing an HBase sink.

    3 votes · 0 comments
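
    Until such a sink exists, a custom activity has to do the write itself. Below is a minimal sketch of that write step using the happybase Python client, assuming an HBase Thrift endpoint is reachable; the host, table, row-key scheme, and column family are illustrative.

      import happybase

      # Placeholder Thrift endpoint; in Data Factory this write would run
      # inside a custom activity, sketched here in Python for brevity.
      connection = happybase.Connection("hbase.example.com")
      table = connection.table("materialized_views")

      # One row per precomputed aggregate; HBase wants byte keys and values.
      table.put(b"product-A|2017-06", {b"agg:total_sales": b"12345"})
      connection.close()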

  • Add scriptPath option to externalise oracleReaderQuery to a .sql file on blob storage

    Add a scriptPath option to externalise oracleReaderQuery to a .sql file on blob storage for on-prem Oracle or SQL Server sources, similar to the option already available for HDInsightHive scripts.

    6 votes · 0 comments
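
    For illustration, the sketch below contrasts today's inline oracleReaderQuery with the proposed option. The scriptPath/scriptLinkedService keys in the second dict are hypothetical (modeled on the HDInsightHive activity), not something the copy activity supports today.

      # Today: the query is embedded in the copy activity's source section.
      source_today = {
          "type": "OracleSource",
          "oracleReaderQuery": "SELECT * FROM SALES"
                               " WHERE LOAD_DATE = TRUNC(SYSDATE)",
      }

      # Proposed (hypothetical properties): externalise the query to a .sql
      # file on blob storage, as HDInsightHive already allows for scripts.
      source_proposed = {
          "type": "OracleSource",
          "scriptPath": "adf-scripts/extract_sales.sql",       # hypothetical
          "scriptLinkedService": "AzureStorageLinkedService",  # hypothetical
      }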

  • Reset a DataFactory (Dev process)

    Add a button that deletes all resources in a DataFactory. When I am testing in Visual Studio with the Publish feature, I have to delete them manually in the portal first so I don't get scheduling errors, then republish everything. An option in both Visual Studio and the portal would be really appreciated, thanks!

    1 vote · 0 comments

  • Delete Activity Logs

    Allow deleting past activities (logs) in Resource Explorer from the Activity Window. When developing and testing with many activities and pipelines, searching becomes difficult even with the filtering options.

    1 vote · 0 comments

  • Conditional execution of activities

    There are usually two paths in an ETL workflow: business-logic activities and error activities. If a business-logic activity fails, the workflow should branch to an error activity, which could implement functionality to enable easier reruns. Conditional branching based on parameters would also make ETL workflows easily extensible.

    2 votes · 0 comments

  • Add a new email activity with the ability to send attachments as part of the workflow

    There are numerous instances when an output (statistics) or error file has to be mailed to administrators. Email as an activity would help implement this functionality.

    1 vote · 0 comments
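
    As a stopgap, the mail step can live in a custom activity or a small script invoked by the workflow. Here is a minimal sketch using only the Python standard library; the SMTP host, credentials, addresses, and file name are placeholders.

      import smtplib
      from email.message import EmailMessage

      msg = EmailMessage()
      msg["Subject"] = "Pipeline run statistics"
      msg["From"] = "pipeline@example.com"      # placeholder addresses
      msg["To"] = "admins@example.com"
      msg.set_content("Run completed; statistics file attached.")

      # Attach the output/error file produced by the pipeline.
      with open("run_statistics.csv", "rb") as f:
          msg.add_attachment(f.read(), maintype="text", subtype="csv",
                             filename="run_statistics.csv")

      with smtplib.SMTP("smtp.example.com", 587) as server:  # placeholder host
          server.starttls()
          server.login("user", "password")                   # placeholder creds
          server.send_message(msg)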

  • Scrape a website like Wikipedia

    1 vote · 0 comments

  • Rename objects in the portal

    Provide the ability to rename all objects and update their associated scripts. Right now, deleting a dataset removes its slice history, which can be very problematic.

    The ability to update a dataset's name and availability without having to recreate it would be very useful.

    2 votes · 0 comments

  • Create an option to pass a custom SOQL query to Salesforce to pull objects

    The current Salesforce connector is great; however, there is a need for an option to pass a custom SOQL query to Salesforce to pull objects. We need to build dynamic SOQL queries at runtime, based on various business rules, and fetch the objects from Salesforce.

    2 votes · 0 comments

  • Add Ability to Select Existing Linked Service in "Copy data" Wizard

    Currently, when using the "Copy data" wizard, you have to create a new linked service every time. Since I am copying sets of data in a specific order, I am creating multiple pipelines and having to create a new linked service for the same data source and destination each time. So, when copying three chunks of data via separate pipelines, I end up with 6 linked services instead of 1 (two per pipeline).

    14 votes · 3 comments

  • Twitter Sentiment Analysis

    Read Twitter messages using Twitter's streaming API and store them somewhere such as a data lake. Run sentiment analysis from Machine Learning Studio or Cognitive Services and visualise the results in Power BI.

    1 vote · 0 comments

  • Add Support for Maintaining Identity Column Values When Copying From/To SQL DBs

    When moving data from one SQL database to another (on-premises or Azure), if there is an identity column in the source table that has a gap (e.g., the IDs are 1, 2, 4, 5), and the destination table is empty with the same structure, the values in the destination table after the copy will be 1, 2, 3, 4 rather than the original values. This can cause issues when the identity column is referenced as a foreign key.

    It would be nice to see an option to keep identity values intact, even if it means that tables for which this…

    23 votes · 0 comments
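
    A manual workaround today is to copy the rows yourself and wrap the insert in SET IDENTITY_INSERT, which preserves the source ID values. Here is a sketch with pyodbc; the connection strings, table, and column names are illustrative.

      import pyodbc

      src = pyodbc.connect("DSN=SourceDb")   # placeholder connections
      dst = pyodbc.connect("DSN=DestDb")

      rows = src.execute("SELECT Id, Name FROM dbo.Customers").fetchall()

      cur = dst.cursor()
      cur.execute("SET IDENTITY_INSERT dbo.Customers ON")
      # An explicit column list is required when inserting into an identity column.
      cur.executemany("INSERT INTO dbo.Customers (Id, Name) VALUES (?, ?)",
                      [(r.Id, r.Name) for r in rows])
      cur.execute("SET IDENTITY_INSERT dbo.Customers OFF")
      dst.commit()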

  • Data source: Google BigQuery / Google Analytics

    It would be great if we could query BigQuery directly to pull in Google Analytics data.

    1 vote · 0 comments

  • Data split options other than time series, i.e. split on other column values (state, category, etc.)

    Support splitting when the data is not time-series based, or when we want to ignore the datetime column and split the data by some category, state, country, etc. Not all incoming data has time intervals along with it.

    6 votes · 0 comments

  • Use the OData connector as a destination

    I'm able to use the OData connector only as a source, not as a destination. Did I miss something?

    1 vote · 0 comments

  • Allow values in the data to be used within the blob/file path

    Blob storage as an output should allow me to use a value in the data (e.g. product-code) as a dynamic value in the file path (e.g. /[product-code]/[year]/...). As data is processed, new path/filename combinations are auto-created (e.g. /product-A/2017/... and /product-B/2017/...).

    3 votes · 0 comments
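
    Done client-side, the requested behaviour looks roughly like the sketch below: group rows by the path-driving values and derive each blob path from them. This uses the legacy azure-storage SDK's BlockBlobService; the account, container, and field names are illustrative.

      from collections import defaultdict
      from azure.storage.blob import BlockBlobService

      service = BlockBlobService(account_name="<account>", account_key="<key>")

      rows = [  # stand-in for the slice being processed
          {"product_code": "product-A", "year": 2017, "line": "a,1"},
          {"product_code": "product-B", "year": 2017, "line": "b,2"},
      ]

      # Group rows by the path-driving values, then write one blob per group.
      groups = defaultdict(list)
      for row in rows:
          groups[(row["product_code"], row["year"])].append(row["line"])

      for (code, year), lines in groups.items():
          blob_path = "%s/%d/data.csv" % (code, year)  # e.g. product-A/2017/data.csv
          service.create_blob_from_text("output", blob_path, "\n".join(lines))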

  • Azure Data Factory - Restart an entire pipeline

    Currently in Azure Data Factory, there is no functionality to restart an entire pipeline. If we need to refresh a dataset in Azure, all associated activities in the pipeline have to be selected and run separately. Can we have an option to run the entire pipeline if required?

    19 votes · 0 comments

  • Batch processing interval

    One major issue we are facing is the batch-processing interval of 15 minutes. We have a requirement to process data as soon as it is received. If that can be achieved, the service will be suitable for near-real-time data processing.

    5 votes · 1 comment

  • Access/map the file name during the copy process to a SQL data table

    I need a way to store the file name being copied into a mapped column of a SQL data table. It would be great to have access to other file properties like size, row count, etc., but the file name alone would help us implement undo processes.

    26 votes · 0 comments
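
    Pending native support, the mapping can be done by a script that reads each file and stamps its name into the target column. Here is a sketch with the legacy azure-storage SDK and pyodbc; the container, DSN, and table schema are illustrative.

      import pyodbc
      from azure.storage.blob import BlockBlobService

      blobs = BlockBlobService(account_name="<account>", account_key="<key>")
      sql = pyodbc.connect("DSN=StagingDb")   # placeholder connection
      cur = sql.cursor()

      for blob in blobs.list_blobs("incoming"):
          content = blobs.get_blob_to_text("incoming", blob.name).content
          for line in content.splitlines():
              # Stamping FileName on every row enables per-file undo later.
              cur.execute("INSERT INTO dbo.Staging (FileName, RawLine)"
                          " VALUES (?, ?)", blob.name, line)
      sql.commit()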

  • SAS

    Include connection support for Azure Blob Storage SAS (shared access signature) data sources.

    5 votes · 0 comments