Data Factory
Azure Data Factory allows you to manage the production of trusted information by offering an easy way to create, orchestrate, and monitor data pipelines over the Hadoop ecosystem using structured, semi-structured, and unstructured data sources. You can connect to your on-premises SQL Server, Azure databases, tables, or blobs and create data pipelines that process the data with Hive and Pig scripting, or custom C# processing. The service offers a holistic monitoring and management experience over these pipelines, including a view of their data production and data lineage down to the source systems. The outcome of Data Factory is the transformation of raw data assets into trusted information that can be shared broadly with BI and analytics tools.
Do you have an idea, suggestion or feedback based on your experience with Azure Data Factory? We’d love to hear your thoughts.
-
Dark theme for Data Factory Web UI
A dark theme for the Azure Data Factory Web UI would be a nice addition for those of us who prefer dark themes in general. It would also be consistent with the Azure portal.
256 votes -
Allow parameters for pipeline reference name in ExecutePipeline activity
The ExecutePipeline activity accepts only a constant pipeline name. It should allow a pipeline to be invoked dynamically by accepting a parameter for the pipeline reference name.
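The behavior being requested amounts to resolving the pipeline reference at runtime rather than at authoring time. A minimal plain-Python sketch (pipeline names and bodies are illustrative, not real ADF objects):

```python
# Hypothetical registry of pipelines; in ADF these would be real
# pipeline resources, not Python callables.
pipelines = {
    "CopySales": lambda: "CopySales ran",
    "CopyHR": lambda: "CopyHR ran",
}

def execute_pipeline(name):
    """Invoke a pipeline chosen at runtime by name, the way a
    parameterized ExecutePipeline reference would work."""
    if name not in pipelines:
        raise KeyError(f"no pipeline named {name!r}")
    return pipelines[name]()

print(execute_pipeline("CopySales"))  # CopySales ran
```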
234 votes -
Change Data Capture feature for RDBMS (Oracle, SQL Server, SAP HANA, etc.)
Is it possible to have CDC features in ADF, please?
235 votes -
Add PowerShell cmdlet support for import/export ARM Template
Currently there is no way to duplicate the functionality of the import and export ARM template features in the adf.azure.com GUI. This is a major gap in the DevOps / CI/CD story. Please add PowerShell cmdlets that allow import and export of just the ADF factory assets. Optionally, provide the ability to include the creation of the data factory in the exported template.
183 votes -
Please allow users to automate “Publish”.
Currently, somebody must go to the dev ADF, open the master branch, and hit “Publish” for the changes to be reflected in the “Adf_publish” branch.
Automating “Publish” on the master branch is necessary to improve efficiency and save time.
164 votes -
Event Hub
Source and sink.
155 votes
Thank you for the feedback. We will look into this.
-
Persist global temporary tables between activities
It is currently not possible to access a global temporary table created by one activity from a subsequent activity.
If this were possible, you could create a pipeline with a Copy activity chained to a Stored Procedure activity, both accessing the same global temporary table. The benefit of this is that operations against database-scoped temporary tables aren't logged, so you can load millions of records in seconds.
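The limitation comes from temp tables being scoped to a single database session, while each activity opens its own. A small sketch using SQLite sessions in place of SQL Server sessions (names are illustrative; the scoping behavior is analogous):

```python
import os
import sqlite3
import tempfile

# Two connections to the same database simulate two ADF activities,
# each opening its own session.
db = os.path.join(tempfile.mkdtemp(), "demo.db")
copy_activity = sqlite3.connect(db)         # first activity's session
stored_proc_activity = sqlite3.connect(db)  # second activity's session

# The "Copy activity" loads a temp table in its session.
copy_activity.execute("CREATE TEMP TABLE staging (id INTEGER)")
copy_activity.execute("INSERT INTO staging VALUES (1)")
copy_activity.commit()

# The second session cannot see the first session's temp table,
# which is exactly the limitation described above.
try:
    stored_proc_activity.execute("SELECT COUNT(*) FROM staging")
    shared = True
except sqlite3.OperationalError:
    shared = False
print(shared)  # False
```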
143 votes -
ForEach activity - Allow break
Allow breaking out of a ForEach activity, the way ForEach works in most languages. Currently ForEach iterates over all items to the end, even when we don't want it to.
If one of the items produces an error, I may want to break the ForEach, stop iterating, and throw that error.
For now, I have to use a flag variable and If activities to stop the ForEach from continuing to call all the activities.
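The flag-variable workaround described above can be sketched in plain Python (activity and item names are illustrative): every iteration still runs, but an If-style guard skips the real work once an error has been recorded.

```python
items = ["a", "bad", "c", "d"]
error = None      # the flag variable

processed = []
for item in items:          # ADF ForEach: always iterates to the end
    if error is None:       # ADF If activity guarding the real work
        try:
            if item == "bad":
                raise ValueError(f"failed on {item!r}")
            processed.append(item)
        except ValueError as exc:
            error = exc     # set the flag; later items are skipped

if error is not None:       # re-throw only after the loop finishes
    print(f"pipeline failed: {error}; processed={processed}")
```

A true break would stop the loop at the failing item instead of burning an iteration (and an activity run) per remaining item.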
143 votes -
ADF connection to Azure Delta Lake
Are there any plans to provide a connection between ADF v2 / Mapping Data Flows and Azure Delta Lake? It would be a great new source and sink for ADF pipelines and Mapping Data Flows, providing full ETL/ELT CDC capabilities to simplify complex lambda data warehouse architecture requirements.
143 votes
Data Factory now supports Delta Lake as source and sink in mapping data flow (public preview). Learn more from https://techcommunity.microsoft.com/t5/azure-data-factory/adf-adds-connectors-for-delta-lake-and-excel/ba-p/1515793.
Copy activity support will come later as well. -
Amazon S3 sink
We'd really need the ability to write to S3 rather than just read.
Many larger clients (big groups with multiple IT departments) will often have both Azure and Amazon, and ADF is getting disqualified from benchmarks against Talend Online and Matillion because it won't push to other cloud services...
Please help ^^
137 votes -
NetSuite connector
It would be great if there were a NetSuite connector.
135 votes -
Allow Stored Procedure Output Parameters
In Logic Apps, it is possible to execute a stored procedure that contains an OUTPUT parameter, read the value of that output parameter, and pass it to the next action in the Logic App. This is not possible with ADF v2.
130 votes -
Ability to name activities in Data Factory dynamically
When copying data from e.g. Azure SQL db to Azure DWH, you may want to use the FOREACH iterator similar to the pattern described in the tutorial at https://docs.microsoft.com/en-us/azure/data-factory/tutorial-bulk-copy.
The downside of this approach is that the logging in the monitor window is somewhat useless, because you cannot see which activity has failed: they are all named the same (but with different RunIds, of course). It would be much better if the activity could be named at runtime, e.g. CopyTableA, CopyTableB, CopyTableC instead of CopyTable, CopyTable, CopyTable.
125 votes -
Disable and enable data factory triggers for DevOps release pipeline
When using DevOps release pipelines for continuous deployment of a data factory, you currently have to manually stop and start the triggers in the target data factory. The PowerShell solution provided in the official docs doesn't work (anymore?). The triggers should be updated automatically on deployment: https://docs.microsoft.com/en-us/azure/data-factory/continuous-integration-deployment#update-active-triggers
122 votes -
Allow Data Factory Managed identity to run Databricks notebooks
Integrate Azure Data Factory Managed Identity in the Databricks service, like you did for Key Vault, storage, etc.
119 votes -
Pause/Start Azure SQL Data Warehouse from ADF
119 votes
Thank you for your feedback. We are now evaluating this ask as part of our product roadmap.
-
Add the ability to restart an activity from within a pipeline within a master pipeline in ADFv2
If the pipeline structure is a master pipeline containing child pipelines, with the activities held within those children, it is not possible to restart a child pipeline and have the parent recognise when the child pipeline completes. Add functionality to allow an activity in a child pipeline to be restarted, with the result then passed back to the parent pipeline on successful completion.
114 votes