Data Factory

Azure Data Factory allows you to manage the production of trusted information by offering an easy way to create, orchestrate, and monitor data pipelines over the Hadoop ecosystem using structured, semi-structured, and unstructured data sources. You can connect to your on-premises SQL Server, Azure SQL Database, Azure tables, or blobs and create data pipelines that process the data with Hive and Pig scripting, or custom C# processing. The service offers a holistic monitoring and management experience over these pipelines, including a view of their data production and data lineage down to the source systems. The outcome of Data Factory is the transformation of raw data assets into trusted information that can be shared broadly with BI and analytics tools.
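
To make the pipeline concept concrete, here is a minimal sketch of a (v1-style) pipeline definition with a single copy activity that moves data from a blob dataset into an Azure SQL dataset. All names here are illustrative placeholders, not part of any product sample:

    {
      "name": "CopyBlobToSqlPipeline",
      "properties": {
        "description": "Illustrative only: copy raw blob data into an Azure SQL table.",
        "activities": [
          {
            "name": "BlobToSqlCopy",
            "type": "Copy",
            "inputs": [ { "name": "RawBlobDataset" } ],
            "outputs": [ { "name": "CuratedSqlDataset" } ],
            "typeProperties": {
              "source": { "type": "BlobSource" },
              "sink": { "type": "SqlSink" }
            },
            "scheduler": { "frequency": "Hour", "interval": 1 }
          }
        ],
        "start": "2016-01-01T00:00:00Z",
        "end": "2016-01-02T00:00:00Z"
      }
    }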

Do you have an idea, suggestion or feedback based on your experience with Azure Data Factory? We’d love to hear your thoughts.

  1. Identify IP Address of Data Factory

    It is not currently possible to identify the IP address of the Data Factory, which you need for firewall rules, including the Azure SQL Server firewall....

    1,375 votes  ·  48 comments

    Thank you for your suggestion. We understand it is very important to be able to whitelist a specific list of IP addresses for ADF as part of firewall rules, rather than opening network access to all. We are working on this with high priority. Once this is ready, we will also add ADF to the list of “Trusted Azure services” for Azure Storage and Azure SQL DB/DW.

  2. Add support for Power Query / Power BI Data Catalog as Data Store/ Linked Service

    Power Query is awesome! It would be a great feature to be able to output its result into either a SQL database or Azure (Storage or SQL).

    450 votes  ·  11 comments

    Please check out the new capability we recently unveiled, Wrangling Data Flows, available in preview! Wrangling Data Flows allows you to discover and explore your data using the familiar Power Query Online mashup editor for data preparation, and then execute at scale on a Spark runtime.

    Sign up for preview access at: https://forms.office.com/Pages/ResponsePage.aspx?id=v4j5cvGGr0GRqy180BHbR9-OHbkcd7NIvtztVhbGIU9UNk5QM0dSWkFDSkFPUlowTFJMRVZUUUZGRi4u and check out more details at https://aka.ms/wranglingdfdocs

3. Web and OData connectors need to support OAuth

    The Web and OData connectors need to add support for OAuth as soon as possible. Most other Microsoft services (Office 365, PWA, CRM, etc.), along with many other industry APIs, require the use of OAuth. Not having it closes the door on lots of integration scenarios. (A sketch of what the OData linked service supports today follows below.)

    265 votes  ·  37 comments
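
    For context, a minimal sketch of the current OData linked service, which is limited to Anonymous or Basic authentication; the URL and credentials are placeholders. This is the surface the request asks to extend with OAuth:

      {
        "name": "ODataLinkedService",
        "properties": {
          "type": "OData",
          "typeProperties": {
            "url": "https://services.odata.org/OData/OData.svc",
            "authenticationType": "Basic",
            "username": "<user>",
            "password": "<password>"
          }
        }
      }
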
4. PostgreSQL as sink

    Now that Azure Database for PostgreSQL is generally available and supported as an ADF source, we would really like to have it as a sink as well to fulfil our data loading requirements. (The sketch below shows the existing source-side linked service.)

    97 votes  ·  5 comments
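
    A minimal sketch of the source-side linked service that exists today, assuming the AzurePostgreSql connector type; server, database, and credentials are placeholders:

      {
        "name": "AzurePostgreSqlLinkedService",
        "properties": {
          "type": "AzurePostgreSql",
          "typeProperties": {
            "connectionString": "Server=<server>.postgres.database.azure.com;Database=<db>;Port=5432;UID=<user>;Password=<password>"
          }
        }
      }
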
  5. Support MySQL as sink

    MySQL as a destination data store.

    90 votes  ·  8 comments
6. List of currently processing and recently run slices in the global view

    It would be nice to have this list, updated on the fly. It would save hunting through the factory to find which slices are being processed.

    It would also be nice to see highlighting, possibly with some animation, to indicate which pipelines are currently running in the diagram view.

    20 votes  ·  started  ·  0 comments
7. Provide parallelism in copying data from Hive to SQL Server IaaS

    We need expert input to improve the performance of a copy activity from our Hive table to SQL Server using an ADF pipeline. Currently the copy runs in single-threaded mode and takes 150 minutes to copy 20 GB of data. Hive has internally split this 20 GB into 51 files; is there a way in ADF to load these files into SQL Server in parallel? (A tuning sketch follows below.)
    Note: the internally split Hive files are not managed by us; they are generated automatically by Hive, and the file naming convention inside the Hive table's folder in blob storage is not known.

    20 votes  ·  1 comment
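
    One knob worth trying, sketched here without any claim that it fits this exact workload: the (v1) copy activity accepts a parallelCopies setting in typeProperties that controls how many concurrent threads read from the source. Dataset names are placeholders:

      {
        "name": "HiveToSqlCopy",
        "type": "Copy",
        "inputs": [ { "name": "HiveOutputBlobDataset" } ],
        "outputs": [ { "name": "SqlServerDataset" } ],
        "typeProperties": {
          "source": { "type": "BlobSource" },
          "sink": { "type": "SqlSink" },
          "parallelCopies": 8
        }
      }
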
  8. Support generation of Datasets from a linked service query/schema

    Datasets are long to describe when Data Factory has to handle many tables. A schema generator driven by a linked service could save a lot of time. (An example of the per-table boilerplate follows below.)

    11 votes  ·  started  ·  0 comments
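
    To illustrate the boilerplate this would eliminate: each table currently needs a hand-written dataset along these lines (the table and columns here are purely illustrative):

      {
        "name": "CustomersDataset",
        "properties": {
          "type": "AzureSqlTable",
          "linkedServiceName": "AzureSqlLinkedService",
          "typeProperties": { "tableName": "Customers" },
          "structure": [
            { "name": "CustomerId", "type": "Int64" },
            { "name": "Name", "type": "String" },
            { "name": "CreatedAt", "type": "Datetime" }
          ],
          "availability": { "frequency": "Day", "interval": 1 }
        }
      }
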
  9. Add better exception handling to Data Management Gateway

    When using Data Management Gateway to connect to on-premises SQL Server, the errors SQL Server returns explaining why a connection failed during credential setup aren't surfaced. This makes it hard to troubleshoot problems with Data Management Gateway.

    9 votes  ·  1 comment
  10. IoT Sample pipeline

    It would be nice to have a sample where we can use Data Factory in an IoT scenario to get started more quickly.

    I would really appreciate this!

    8 votes  ·  started  ·  2 comments
  11. Unstructured Data

    More file formats should be allowed; copying to Azure Blob does not appear to support PDF, Word, image, and other formats.

    It would be really great if we could have some process in place to read PDF, Word, and image files (unstructured data).

    7 votes  ·  0 comments

    You can currently copy any file format via the copy activity: simply do not provide the structure element in the dataset (see the sketch below). But we do want to surface this in a first-class manner.
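
    A minimal sketch of such a dataset: with no structure element and no format section, the copy activity moves the files (PDF, Word, images) as opaque binary. The container and folder names are placeholders:

      {
        "name": "BinaryDocumentsDataset",
        "properties": {
          "type": "AzureBlob",
          "linkedServiceName": "StorageLinkedService",
          "typeProperties": { "folderPath": "documents/incoming" },
          "availability": { "frequency": "Day", "interval": 1 }
        }
      }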

  12. Multiple line queries with syntax highlighting in portal editor

    Currently, a pipeline query in the portal editor can only be one line, with no syntax highlighting.

    This makes it hard to read and edit, easy to introduce errors (particularly when escaping characters), and hard to spot them.

    Please add syntax highlighting, and allow the query to span multiple lines (even when in a Text.Format macro).

    6 votes  ·  1 comment

    Thanks for your feedback. We are working on an authoring experience that will include syntax highlighting. For a query spanning multiple lines, you can store the query in your storage account and reference its path in the ‘scriptPath’ parameter (sketched below). This allows your query to span multiple lines while using ‘Text.Format’.
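
    A sketch of that workaround for an HDInsight Hive activity: the multi-line query lives in a .hql file in storage and is referenced via scriptPath instead of being inlined as a single-line string. Paths and names are placeholders:

      {
        "name": "RunHiveScript",
        "type": "HDInsightHive",
        "linkedServiceName": "HDInsightLinkedService",
        "typeProperties": {
          "scriptPath": "adfscripts/transform.hql",
          "scriptLinkedService": "StorageLinkedService",
          "defines": { "inputTable": "rawlogs" }
        }
      }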

  13. Azure Data Factory Visual Studio 2015 Deployment Rights

    At present you need co-admin rights to deploy, and businesses cannot give out those rights. As a subscription owner I should be able to deploy from Visual Studio, since those rights already let me create and delete in the portal!

    5 votes  ·  2 comments