Data Factory

Azure Data Factory lets you manage the production of trusted information by offering an easy way to create, orchestrate, and monitor data pipelines over the Hadoop ecosystem using structured, semi-structured, and unstructured data sources. You can connect to your on-premises SQL Server, Azure databases, tables, or blobs and create data pipelines that process the data with Hive and Pig scripting or custom C# code. The service offers a holistic monitoring and management experience over these pipelines, including a view of their data production and data lineage down to the source systems. The outcome of Data Factory is the transformation of raw data assets into trusted information that can be shared broadly with BI and analytics tools.

Do you have an idea, suggestion or feedback based on your experience with Azure Data Factory? We’d love to hear your thoughts.

  1. Change Data Capture feature for RDBMS (Oracle, SQL Server, SAP HANA, etc)

    Is it possible to have CDC features in ADF, please?

    211 votes  ·  planned  ·  11 comments
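    In the absence of built-in CDC, the usual workaround is a high-watermark column: each run extracts only rows modified since the last recorded watermark. The sketch below shows that pattern in miniature using Python's stdlib sqlite3 as a stand-in for the RDBMS; the table and column names are hypothetical, not part of any ADF API.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src (id INTEGER PRIMARY KEY, payload TEXT, modified_at TEXT);
    CREATE TABLE watermark (last_value TEXT);
    INSERT INTO watermark VALUES ('2024-01-01T00:00:00');
    INSERT INTO src VALUES (1, 'old', '2023-12-31T09:00:00');
    INSERT INTO src VALUES (2, 'new', '2024-01-02T09:00:00');
""")

def incremental_extract(conn):
    """Return rows changed since the stored watermark, then advance it."""
    (wm,) = conn.execute("SELECT last_value FROM watermark").fetchone()
    rows = conn.execute(
        "SELECT id, payload, modified_at FROM src WHERE modified_at > ?", (wm,)
    ).fetchall()
    if rows:
        # Advance the watermark to the newest change we just picked up.
        new_wm = max(r[2] for r in rows)
        conn.execute("UPDATE watermark SET last_value = ?", (new_wm,))
    return rows

changed = incremental_extract(conn)
print(changed)  # only the row modified after the watermark
```

    Real CDC (as requested above) goes further than this: a watermark misses deletes and intermediate versions, which is why log-based CDC support in the connectors themselves would be valuable.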
  2. Azure Data Factory - Restart an entire pipeline

    Currently in Azure Data Factory there is no way to restart an entire pipeline. If we need to refresh a dataset in Azure, all associated activities in the pipeline have to be selected and run separately. Can we have an option to run the entire pipeline if required?

    190 votes  ·  5 comments
  3. Post-copy script in Copy Activity

    The Copy activity has a pre-copy script feature. A similar post-copy script feature would make it possible to execute code from the same activity once the copy operation completes.

    Traditionally, when data is copied from a source SQL database to a destination SQL database, it is copied incrementally into temporary/staging tables (or in-memory tables) in the destination. After the copy, merge code is executed to merge the data into the target table.

    A post-copy script option in the Copy activity would let the merge code be called from the Copy activity itself, instead of requiring another activity such as Execute Stored Procedure.

    188 votes  ·  6 comments
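    The stage-then-merge pattern described above can be sketched in miniature with Python's stdlib sqlite3 (table names hypothetical). On SQL Server the merge step would be a T-SQL MERGE statement; today in ADF it has to run from a separate Stored Procedure activity, which is exactly what the requested post-copy script would fold into the Copy activity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stage  (id INTEGER PRIMARY KEY, val TEXT);
    CREATE TABLE target (id INTEGER PRIMARY KEY, val TEXT);
    INSERT INTO target VALUES (1, 'old');
    -- rows the Copy activity would have landed in the staging table:
    INSERT INTO stage VALUES (1, 'updated'), (2, 'new');
""")

# The "post-copy script": merge staged rows into the target (upsert).
# SQLite spells MERGE as INSERT ... ON CONFLICT (the WHERE true is
# required by SQLite's upsert grammar when the source is a SELECT).
conn.execute("""
    INSERT INTO target (id, val)
    SELECT id, val FROM stage WHERE true
    ON CONFLICT(id) DO UPDATE SET val = excluded.val
""")
conn.execute("DELETE FROM stage")

rows = conn.execute("SELECT id, val FROM target ORDER BY id").fetchall()
print(rows)  # [(1, 'updated'), (2, 'new')]
```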
  4. Allow parameters for pipeline reference name in ExecutePipeline activity

    The ExecutePipeline activity accepts only a constant pipeline name. It should allow a pipeline to be invoked dynamically by accepting a parameter for the pipeline reference name.

    165 votes  ·  6 comments
  5. Add PowerShell cmdlet support for import/export ARM Template

    Currently there is no way to duplicate the functionality of the ARM template import and export in the adf.azure.com GUI. This is a major gap in the DevOps / CI/CD story. Please add PowerShell cmdlets that allow import and export of just the ADF factory assets. Optionally, provide the ability to include the creation of the data factory in the exported template.

    142 votes  ·  4 comments
  6. Event Hub

    Support Event Hub as both a source and a sink.

    144 votes  ·  6 comments
  7. ADF connection to Azure Delta Lake

    Are there any plans to provide a connection between ADF v2/Mapping Data Flows and Delta Lake on Azure? It would be a great new source and sink for ADF pipelines and Mapping Data Flows, providing full ETL/ELT CDC capabilities to simplify complex lambda data warehouse architecture requirements.

    137 votes  ·  10 comments
  8. 125 votes  ·  3 comments
  9. Persist global temporary tables between activities

    It is currently not possible to access a global temporary table created by one activity from a subsequent activity.

    If this were possible, you could create a pipeline with a Copy activity chained to a Stored Procedure activity, both accessing the same global temporary table. The benefit is that operations against database-scoped temporary tables aren't logged, so you can load millions of records in seconds.

    121 votes  ·  2 comments
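    The limitation above comes from session scoping: each activity opens its own database session, and temporary objects tied to the creating session vanish with it. The sketch below illustrates the effect with Python's stdlib sqlite3, whose TEMP tables are connection-scoped (analogous to, though not identical with, SQL Server's temp-table lifetimes); the table name is hypothetical.

```python
import sqlite3, tempfile, os

db = os.path.join(tempfile.mkdtemp(), "demo.db")

# "Activity 1": opens its own session and creates a temporary table.
conn_a = sqlite3.connect(db)
conn_a.execute("CREATE TEMP TABLE staging (id INTEGER)")
conn_a.execute("INSERT INTO staging VALUES (1)")

# "Activity 2": a separate session against the same database.
conn_b = sqlite3.connect(db)
try:
    conn_b.execute("SELECT * FROM staging")
    visible = True
except sqlite3.OperationalError:  # no such table: staging
    visible = False

print(visible)  # False: the temp table dies with the session that made it
```

    The request is effectively for ADF to keep one session alive across chained activities so that a global temporary table created in the first is still visible in the second.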
  10. NetSuite connector

    It would be great if there were a NetSuite connector.

    118 votes  ·  1 comment
  11. Pause/Start Azure SQL Data Warehouse from ADF

    Add the ability to pause and start (resume) an Azure SQL Data Warehouse instance from within an ADF pipeline.

    111 votes  ·  5 comments
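    A common workaround today is an ADF Web Activity that calls the Azure management REST API's pause/resume endpoints for the warehouse. The helper below is only a sketch that assembles the request URL; the resource names are placeholders, the default api-version is an assumption, and authentication (e.g. the factory's managed identity) is out of scope.

```python
def sql_dw_action_url(subscription_id: str, resource_group: str,
                      server: str, database: str, action: str,
                      api_version: str = "2017-10-01-preview") -> str:
    """Build the management REST URL to pause or resume a SQL DW database."""
    if action not in ("pause", "resume"):
        raise ValueError("action must be 'pause' or 'resume'")
    return (
        "https://management.azure.com"
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        "/providers/Microsoft.Sql"
        f"/servers/{server}/databases/{database}"
        f"/{action}?api-version={api_version}"
    )

url = sql_dw_action_url("sub-id", "my-rg", "my-server", "my-dwh", "pause")
print(url)
```

    In a Web Activity you would POST to such a URL; having this as a first-class ADF activity (the request above) would remove the need to hand-build it.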
  12. Save git branch settings on server side/Factory wide

    The customer needs to use a specific branch for a Data Factory resource, but currently the branch setting is saved to a cookie as <user>_preference, so the "Branch selection" dialog has to be answered every time the cache is cleared or the factory is accessed from a different machine or user.
    Please add the ability to save this as a factory-wide setting to avoid user error.

    107 votes  ·  0 comments
  13. Richer variable support

    Allow custom variables at the pipeline and factory level that can be refreshed on a specified schedule from a dataset; the closest analogue would be SSIS variables.

    One use case would be to store a set of UTC offsets in a SQL table for each data source and query this table at pipeline runtime to retrieve the correct offset for each source. The offset could then be stored in a variable for each pipeline.

    107 votes  ·  4 comments
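    The UTC-offset use case above amounts to a scheduled lookup that hydrates variables at run time. A minimal sketch, with sqlite3 standing in for the SQL source and hypothetical table/column names:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE source_offsets (source TEXT PRIMARY KEY,
                                 utc_offset_hours INTEGER);
    INSERT INTO source_offsets VALUES ('erp', -5), ('crm', 1);
""")

def load_offsets(conn):
    """What a scheduled refresh of factory-level variables would fetch."""
    return dict(conn.execute(
        "SELECT source, utc_offset_hours FROM source_offsets"))

offsets = load_offsets(conn)
print(offsets)  # {'erp': -5, 'crm': 1}
```

    Today this is approximated per-pipeline with a Lookup activity feeding Set Variable; the request is for the factory to own the refresh schedule and scope.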
  14. Dark theme for Data Factory Web UI

    A dark theme for the Azure Data Factory Web UI would be a nice addition for those of us who prefer dark themes in general. It would also be consistent with the Azure portal.

    106 votes  ·  0 comments
  15. Allow Stored Procedure Output Parameters

    In Logic Apps, it is possible to execute a stored procedure that contains an OUTPUT parameter, read the value of that output parameter, and pass it to the next action in the Logic App. This is not possible with ADF v2.

    105 votes  ·  4 comments
  16. Ability to name activities in Data Factory dynamically

    When copying data from e.g. an Azure SQL database to Azure SQL Data Warehouse, you may want to use the ForEach iterator, similar to the pattern described in the tutorial at https://docs.microsoft.com/en-us/azure/data-factory/tutorial-bulk-copy.
    The downside of this approach is that the logging in the monitor window is somewhat useless: you cannot see which activity has failed because they are all named the same (but with different RunIds, of course).

    It would be much better if the activity could be named at runtime, e.g. CopyTableA, CopyTableB, CopyTableC instead of CopyTable, CopyTable, CopyTable.

    104 votes  ·  13 comments
  17. Support encrypted flat files as the source in copy activities

    We use the approach below to encrypt sensitive flat files at rest. Please add a feature to ADF that supports reading encrypted flat files from blob storage:
    https://azure.microsoft.com/en-us/documentation/articles/storage-encrypt-decrypt-blobs-key-vault/

    103 votes  ·  9 comments
  18. 102 votes  ·  6 comments
  19. Amazon S3 sink

    We really need the ability to write to S3 rather than just read from it.

    Many larger clients (big groups with multiple IT departments) often have both Azure and Amazon, and ADF is getting disqualified from benchmarks against Talend Online and Matillion because it won't push to other cloud services.

    Please help ^^

    101 votes  ·  5 comments
  20. Disable and enable data factory triggers for DevOps release pipeline

    When using Azure DevOps release pipelines for continuous deployment of a data factory, you currently have to stop and start the triggers in the target data factory manually. The PowerShell solution provided in the official docs doesn't work (anymore?). The triggers should be updated automatically on deployment: https://docs.microsoft.com/en-us/azure/data-factory/continuous-integration-deployment#update-active-triggers

    100 votes  ·  1 comment
