Data Factory

Azure Data Factory allows you to manage the production of trusted information by offering an easy way to create, orchestrate, and monitor data pipelines over the Hadoop ecosystem using structured, semi-structured, and unstructured data sources. You can connect to your on-premises SQL Server, Azure database, tables or blobs and create data pipelines that will process the data with Hive and Pig scripting, or custom C# processing. The service offers a holistic monitoring and management experience over these pipelines, including a view of their data production and data lineage down to the source systems. The outcome of Data Factory is the transformation of raw data assets into trusted information that can be shared broadly with BI and analytics tools.

Do you have an idea, suggestion or feedback based on your experience with Azure Data Factory? We’d love to hear your thoughts.

  1. Static IP ranges for Data Factory and add ADF to list of Trusted Azure Services

    It is not currently possible to identify the IP Address of the DF, which you need for firewall rules, including Azure SQL Server firewall....

    1,685 votes
    64 comments

    Thank you for your suggestions and your patience! We are working hard to enable the following enhancements for better network security:
    - Adding ADF to the list of “Trusted Azure service” for Azure Key Vault and Azure Storage (blob/ADLS Gen2). ETA is in the upcoming weeks.
    - Static IP range for Azure Integration Runtime so that you can whitelist specific IP ranges for ADF as part of firewall rules. ETA is next few months.
    - Support service tag for ADF

    We will provide an update as soon as any of the above becomes available.

  2. Support SFTP as sink

    Support pushing data into SFTP in copy activity.

    756 votes
    42 comments
  3. Add support for Power Query / Power BI Data Catalog as Data Store/ Linked Service

    Power Query is awesome! It would be a great feature to be able to output its result into either a SQL database or Azure (Storage or SQL).

    454 votes
    11 comments

    Please check the new capability we recently unveiled called Wrangling Data Flows, available in preview! Wrangling Data Flow allows you to discover and explore your data using the familiar Power Query Online mashup editor to do data preparation, and then execute at scale using Spark runtime.

    Sign up for preview access at: https://forms.office.com/Pages/ResponsePage.aspx?id=v4j5cvGGr0GRqy180BHbR9-OHbkcd7NIvtztVhbGIU9UNk5QM0dSWkFDSkFPUlowTFJMRVZUUUZGRi4u and check out more details at https://aka.ms/wranglingdfdocs

  4. Web and ODATA connectors need to support OAuth

    The Web and OData connectors need to add support for OAuth as soon as possible. Most other Microsoft services (Office 365, PWA, CRM, etc.), along with many other industry APIs, require the use of OAuth. Not having this closes the door to many integration scenarios.

    287 votes
    40 comments
  5. Support Snowflake as Sink

    Provide the capability to copy data from Blob to Snowflake data warehouse

    249 votes
    7 comments
  6. List of currently progressing and recently run slices in the global view

    It would be nice to have this list and have it updated on the fly. It would save hunting through the factory to find which slices are being processed.

    It would also be nice to see highlighting, possibly with some animation, to indicate which pipelines are currently being run in the diagram view.

    21 votes
    started  ·  0 comments
  7. Provide Parallelism in Copying data from Hive to SQL Server IAAS

    We need your expert input to improve the performance of the copy activity from our Hive table to SQL Server using an ADF pipeline. Currently the copy activity runs in single-threaded mode and takes 150 minutes to copy 20 GB of data. Hive has internally split this 20 GB of data into multiple files (we see 51 files). Is there a way in ADF to load these files into SQL Server in parallel?
    Note: The internally split files are not managed by us; Hive generates them automatically. The file naming convention inside the Hive table (folder) in blob storage is not known.

    20 votes
    1 comment
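
To put the figures in this request in perspective, the reported 150 minutes for 20 GB works out to roughly 2.28 MB/s. The sketch below reproduces that arithmetic and estimates the copy time if the files could be loaded in parallel. It assumes ideal linear scaling, which a real source, network, or SQL Server sink may not deliver, and the function names and the parallel copy counts are hypothetical, not an ADF setting.

```python
def throughput_mb_per_s(size_gb: float, minutes: float) -> float:
    """Average throughput in MB/s for copying size_gb gigabytes in the given minutes."""
    return size_gb * 1024 / (minutes * 60)


def ideal_minutes(serial_minutes: float, parallel_copies: int) -> float:
    """Best-case duration if the work splits evenly across parallel copies.

    Assumption: perfect linear scaling with no contention at source or sink.
    """
    if parallel_copies < 1:
        raise ValueError("parallel_copies must be at least 1")
    return serial_minutes / parallel_copies


# Figures taken from the request: 20 GB copied in 150 minutes.
observed = throughput_mb_per_s(20, 150)
print(f"Observed throughput: {observed:.2f} MB/s")

# Hypothetical degrees of parallelism, for illustration only.
for copies in (2, 4, 8):
    print(f"{copies} parallel copies: about {ideal_minutes(150, copies):.1f} minutes")
```

Because the 51 files are independent of one another, they are a natural unit for splitting the work, but the actual speed-up would depend on where the bottleneck sits.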
  8. Provide better billing statistics

    Provide better cost analysis options (either in the Azure portal or in ADF). Right now it is impossible to see costs by pipeline or activity; you can only see the overall cost of the whole data factory instance, which is not very useful.
    Please add billing per pipeline (Logic Apps is a good example, where you can track costs for each logic app).

    17 votes
    0 comments
  9. Support generation of Datasets from a linked service query/schema

    Datasets can take a long time to describe if many tables need to be handled by Data Factory. A schema generator based on a linked service could save a lot of time.

    11 votes
    started  ·  0 comments
  10. Add better exception handling to Data Management Gateway

    When using Data Management Gateway to connect to on-prem SQL, errors returned by SQL Server explaining why connections weren't successful during credential setting aren't being surfaced. This makes it hard to troubleshoot problems with Data Management Gateway.

    9 votes
    1 comment
  11. IoT Sample pipeline

    It would be nice to have a sample where we can use Data Factory in an IoT scenario to get started more quickly.

    I would really appreciate this!

    8 votes
    started  ·  2 comments
  12. Unstructured Data

    More file formats should be supported. Copying to Azure Blob does not appear to support PDF, Word, image, and other formats.

    It would be really great to have some process in place to read PDF, Word, and image files (unstructured data).

    7 votes
    0 comments

    You can currently copy any file format via the copy activity: simply omit the structure element from the dataset. However, we do want to surface this in a first-class manner.
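
As a rough illustration of the workaround described in the response above, the sketch below takes a dataset definition held as a Python dictionary and returns a copy without the `structure` element. The dictionary layout is only loosely modeled on a JSON dataset definition, and every name in it (`ScannedDocuments`, `ExampleStorageLinkedService`, the folder path, the helper `without_structure`) is hypothetical. Whether other elements also need to be omitted for a plain file copy is not stated in the response, so this covers only the `structure` element.

```python
import copy
import json


def without_structure(dataset: dict) -> dict:
    """Return a copy of a dataset definition with the 'structure' element removed.

    The input is left unchanged. A missing 'structure' element is not an error.
    """
    result = copy.deepcopy(dataset)
    result.get("properties", {}).pop("structure", None)
    return result


# Hypothetical dataset definition; all names and paths are placeholders.
typed_dataset = {
    "name": "ScannedDocuments",
    "properties": {
        "type": "AzureBlob",
        "linkedServiceName": "ExampleStorageLinkedService",
        "structure": [{"name": "col1", "type": "String"}],
        "typeProperties": {"folderPath": "examplecontainer/pdfs/"},
    },
}

file_copy_dataset = without_structure(typed_dataset)
print(json.dumps(file_copy_dataset, indent=2))
```

The helper works on a deep copy so the original definition can still be used where column metadata is wanted.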

  13. Multiple line queries with syntax highlighting in portal editor

    Currently, a pipeline query in the portal editor can only be one line, with no syntax highlighting.

    This makes it hard to read and edit, easy to introduce errors (particularly when escaping characters), and hard to spot them.

    Please add syntax highlighting, and allow the query to span multiple lines (even when in a Text.Format macro).

    6 votes
    1 comment

    Thanks for your feedback. We are working on an authoring experience that will allow you to use syntax highlighting. For queries spanning multiple lines, you can store your query in your storage account and reference its path in the ‘scriptpath’ parameter. This will allow your query to span multiple lines while using ‘Text.Format’.

  14. Azure Data Factory Visual Studio 2015 Deployment Rights

    At present you need co-admin rights to deploy. Businesses cannot give out these rights. As a subscription owner I should be able to deploy from VS as these rights give me access in the portal to create and delete!

    5 votes
    2 comments