Data Factory
Azure Data Factory allows you to manage the production of trusted information by offering an easy way to create, orchestrate, and monitor data pipelines over the Hadoop ecosystem using structured, semi-structured, and unstructured data sources. You can connect to your on-premises SQL Server, Azure databases, tables, or blobs and create data pipelines that process the data with Hive and Pig scripting, or custom C# processing. The service offers a holistic monitoring and management experience over these pipelines, including a view of their data production and data lineage down to the source systems. The outcome of Data Factory is the transformation of raw data assets into trusted information that can be shared broadly with BI and analytics tools.
Do you have an idea, suggestion or feedback based on your experience with Azure Data Factory? We’d love to hear your thoughts.
-
Change Data Capture feature for RDBMS (Oracle, SQL Server, SAP HANA, etc)
Is it possible to have CDC features in ADF, please?
241 votes -
Add PowerShell cmdlet support for import/export ARM Template
Currently there is no way to duplicate the functionality of the ARM template import and export in the adf.azure.com GUI. This is a major gap in the DevOps / CI/CD story. Please add PowerShell cmdlets that allow import and export of just the ADF factory assets. Optionally, provide the ability to include the creation of the data factory in the exported template.
189 votes -
Please allow users to automate “Publish”.
Currently, somebody must go to the dev ADF, open the master branch, and hit “Publish” for the changes to update the “adf_publish” branch.
Automating “Publish” on the master branch is necessary for improving efficiency and saving time.
168 votes -
ForEach activity - Allow break
Allow breaking out of a ForEach activity, the way ForEach works in most languages. Currently ForEach iterates over all items to the end, even when we don't want it to.
If I have an error in one of the items, I may want to break the ForEach, stop iterating, and throw that error.
For now, I have to use a flag variable and If Conditions to stop ForEach from calling the remaining activities.
163 votes -
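The flag-variable workaround described in the request above amounts to the following pattern: each iteration is guarded by a condition that short-circuits once a failure has been seen. A Python sketch (names hypothetical) contrasting the requested break semantics with the current workaround:

```python
def run_items_with_break(items, process):
    """Desired ForEach semantics: stop at the first failure and re-raise."""
    for item in items:
        process(item)  # an exception here ends the loop immediately

def run_items_with_flag(items, process):
    """Current ADF workaround: a flag variable plus an If Condition per
    iteration; every item is still visited, but the real work is skipped."""
    failed = None
    for item in items:
        if failed is not None:      # the "If Condition" guarding the activities
            continue                # the loop still iterates, work does not run
        try:
            process(item)
        except Exception as exc:
            failed = exc            # the "Set Variable" activity setting the flag
    if failed is not None:
        raise failed                # surface the original error only at the end

processed = []
def process(item):
    if item == "bad":
        raise ValueError("item failed")
    processed.append(item)

try:
    run_items_with_flag(["a", "bad", "c"], process)
except ValueError:
    pass
print(processed)  # → ['a']  ("c" is skipped, but the loop still iterated over it)
```

The sketch shows why the workaround is unsatisfying: the ForEach still runs to completion and the error only surfaces after the final iteration.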
Event Hub
Source and sink.
162 votes
Thank you for the feedback. We will look into this.
-
Amazon S3 sink
We'd really need the ability to write to S3 rather than just read.
Many larger clients (big groups with multiple IT departments) often have both Azure and Amazon, and ADF is getting disqualified from benchmarks against Talend Online and Matillion because it won't push to other cloud services...
Please help ^^
151 votes -
Persist global temporary tables between activities
It is currently not possible to access a global temporary table created by one activity from a subsequent activity.
If this were possible, you could create a pipeline with a Copy activity chained to a Stored Procedure activity, with both accessing the same global temporary table. The benefit is that operations against database-scoped temporary tables aren't fully logged, so you can load millions of records in seconds.
151 votes -
ADF connection to Azure Delta Lake
Are there any plans to provide a connection between ADF v2/Mapping Data Flows and Azure Delta Lake? It would be a great new source and sink for ADF pipelines and Mapping Data Flows, providing full ETL/ELT CDC capabilities to simplify complex lambda data warehouse architecture requirements.
148 votes
Data Factory now supports Delta Lake as source and sink in Mapping Data Flow (public preview). Learn more from https://techcommunity.microsoft.com/t5/azure-data-factory/adf-adds-connectors-for-delta-lake-and-excel/ba-p/1515793.
Copy activity support will come later as well. -
Disable and enable data factory triggers for DevOps release pipeline
When using DevOps release pipelines for continuous deployment of a data factory, you currently have to manually stop and start the triggers in the target data factory. The PowerShell solution provided in the official docs doesn't work (anymore?). The triggers should be updated automatically on deployment: https://docs.microsoft.com/en-us/azure/data-factory/continuous-integration-deployment#update-active-triggers
141 votes -
NetSuite connector
It would be great if there were a NetSuite connector.
135 votes -
Ability to name activities in Data Factory dynamically
When copying data from e.g. Azure SQL db to Azure DWH, you may want to use the FOREACH iterator similar to the pattern described in the tutorial at https://docs.microsoft.com/en-us/azure/data-factory/tutorial-bulk-copy.
The downside of this approach is that the logging in the monitor window is somewhat useless: you cannot see which activity has failed because they are all named the same (though with different RunIds, of course). It would be much better if the activity could be named at runtime, e.g. CopyTableA, CopyTableB, CopyTableC instead of CopyTable, CopyTable, CopyTable.
135 votes -
Allow Stored Procedure Output Parameters
In Logic Apps, it is possible to execute a stored procedure that contains an OUTPUT parameter, read the value of that output parameter, and pass it to the next activity in the Logic App. This is not possible with ADF v2.
134 votes -
Allow Data Factory Managed identity to run Databricks notebooks
Integrate Azure Data Factory Managed Identity into the Databricks service, like you did for Key Vault, storage, etc.
128 votes -
GitLab Integration in Azure Data Factory
It would be useful to have GitLab integration in Azure Data Factory along with GitHub and Azure Repos, as it's one of the most popular tools.
127 votes -
Refer to Azure Key Vault secrets in dynamic content
If I need to crawl a RESTful API that is protected with an API key, the only way to set that key is by injecting an additional header at the dataset level. This key is stored in clear text, which is poor security.
To make matters worse, if git integration is enabled, that key is even committed into version control.
There should be a way to fetch values from Azure Key Vault elsewhere than just for setting up linked services. Alternatively, the REST linked service should support authentication with an API key.
122 votes -
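Until first-class support exists, a common workaround for the Key Vault request above is to fetch the secret at runtime with a Web activity that calls Key Vault's REST API (authenticated with the factory's managed identity), then feed the returned value into dynamic content. A minimal Python sketch of the request/response shape — the vault and secret names are hypothetical, and the response body here is a fabricated sample:

```python
import json
from urllib.parse import urlencode

def secret_url(vault: str, name: str, api_version: str = "7.4") -> str:
    """Build the Key Vault REST URL a Web activity would call:
    GET https://{vault}.vault.azure.net/secrets/{name}?api-version=..."""
    return (f"https://{vault}.vault.azure.net/secrets/{name}"
            f"?{urlencode({'api-version': api_version})}")

def extract_secret(response_body: str) -> str:
    """Key Vault returns JSON whose 'value' field holds the secret."""
    return json.loads(response_body)["value"]

# Hypothetical vault "contoso-kv" and secret "rest-api-key".
url = secret_url("contoso-kv", "rest-api-key")
print(url)

# A fabricated sample of the response body Key Vault would return:
sample = '{"value": "s3cr3t", "id": "https://contoso-kv.vault.azure.net/secrets/rest-api-key/abc123"}'
print(extract_secret(sample))
```

In a pipeline, the Web activity's output would be referenced via dynamic content (e.g. an expression over the activity output's `value` field) rather than parsed in Python; the sketch only illustrates the URL and payload shape.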
Pause/Start Azure SQL Data Warehouse from ADF
122 votes
Thank you for your feedback. We are now evaluating this ask as part of our product roadmap.
-
Get Metadata for Multiple Files Matching Wildcard
The Get Metadata activity is not useful when there is a wildcard in the dataset file path. Could the Get Metadata activity be expanded to return the number of files found in the dataset, and an array of metadata values? We want to use this information to decide whether to continue with the remainder of the pipeline, based on whether any files satisfy the wildcard.
117 votes
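The decision logic asked for above — count the files matching a wildcard and only continue when at least one matches — can be approximated today by pointing Get Metadata at the folder, reading its childItems, and filtering the result (e.g. with a Filter activity). A Python sketch of that filtering step, using a hypothetical childItems payload and pattern:

```python
from fnmatch import fnmatch

# Hypothetical childItems payload, shaped like Get Metadata's folder output.
child_items = [
    {"name": "sales_2024_01.csv", "type": "File"},
    {"name": "sales_2024_02.csv", "type": "File"},
    {"name": "readme.txt", "type": "File"},
    {"name": "archive", "type": "Folder"},
]

def matching_files(items, pattern):
    """Mimic a Filter activity: keep files whose names match the wildcard."""
    return [i["name"] for i in items
            if i["type"] == "File" and fnmatch(i["name"], pattern)]

matches = matching_files(child_items, "sales_*.csv")
print(len(matches))   # the file count the request asks Get Metadata to return
if matches:           # gate the rest of the pipeline on any match
    print("continue pipeline with:", matches)
```

In ADF itself the same check would be an If Condition over the Filter activity's output count; the sketch only shows the logic being requested.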