Activity that copies and then deletes. (1,164 votes)
Thank you for your suggestions and feedback! A built-in, first-class Delete Activity is now available in ADF. Please check out this blog: https://azure.microsoft.com/en-us/blog/clean-up-files-by-built-in-delete-activity-in-azure-data-factory/ and this document: https://docs.microsoft.com/en-us/azure/data-factory/delete-activity
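As a minimal sketch of the copy-then-delete pattern (activity and dataset names such as CopyFiles and SourceBlobDataset are placeholders, not from the original post), the Delete activity simply depends on the Copy activity succeeding:

    {
      "name": "CopyThenDeletePipeline",
      "properties": {
        "activities": [
          {
            "name": "CopyFiles",
            "type": "Copy",
            "inputs": [ { "referenceName": "SourceBlobDataset", "type": "DatasetReference" } ],
            "outputs": [ { "referenceName": "SinkBlobDataset", "type": "DatasetReference" } ],
            "typeProperties": {
              "source": { "type": "BlobSource" },
              "sink": { "type": "BlobSink" }
            }
          },
          {
            "name": "DeleteSourceFiles",
            "type": "Delete",
            "dependsOn": [ { "activity": "CopyFiles", "dependencyConditions": [ "Succeeded" ] } ],
            "typeProperties": {
              "dataset": { "referenceName": "SourceBlobDataset", "type": "DatasetReference" },
              "recursive": true
            }
          }
        ]
      }
    }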
Rather than the time slice idea, allow us to schedule pipelines as jobs, the same way I would schedule an agent job to run SSIS packages. Setting availability for datasets is a very awkward way to go about this. A scheduler would be 10 times easier and more intuitive.
Also, allow users to "run" a pipeline on demand; this would make testing a lot easier. (748 votes)
Thanks so much for your feedback! We have made great enhancements in ADF V2 to make the control flow much more flexible. Please refer to this document on pipeline execution and triggers, which covers running a pipeline on demand as well as creating schedule, tumbling window, and event-based triggers: https://docs.microsoft.com/en-us/azure/data-factory/concepts-pipeline-execution-triggers
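As a rough sketch, a schedule trigger that runs a pipeline once a day looks roughly like this (names and times are illustrative placeholders); on-demand runs can also be started from the authoring UI (Debug / Trigger Now) or through the programmatic interfaces:

    {
      "name": "DailyTrigger",
      "properties": {
        "type": "ScheduleTrigger",
        "typeProperties": {
          "recurrence": {
            "frequency": "Day",
            "interval": 1,
            "startTime": "2019-01-01T06:00:00Z",
            "timeZone": "UTC"
          }
        },
        "pipelines": [
          {
            "pipelineReference": { "referenceName": "MyPipeline", "type": "PipelineReference" }
          }
        ]
      }
    }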
The JSON editor is OK, but it is still a barrier to entry. A WYSIWYG UI based on SSIS/Machine Learning Studio would really make this easier to use. (358 votes)
We now have a full-fledged user experience that allows you to visually author feature-rich pipelines – from data ingestion and orchestration of a variety of activities to authoring rich transformations with no code!
It'd make it much easier to adopt Data Factory if it was possible to add Azure Functions activities into a Pipeline.
You can already store a blob and have an Azure Function trigger based on that, but having the functions directly in the pipeline source would make Data Factory management easier, not to mention the clarity it would give about the Data Factory's functionality. (306 votes)
Thank you so much for your feedback to ADF! First-class integration with Azure Function is now available as a native ADF activity: https://docs.microsoft.com/en-us/azure/data-factory/control-flow-azure-function-activity
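A minimal sketch of the Azure Function activity inside a pipeline (the linked service name, function name, and body are placeholders):

    {
      "name": "CallAzureFunction",
      "type": "AzureFunctionActivity",
      "linkedServiceName": { "referenceName": "AzureFunctionLinkedService", "type": "LinkedServiceReference" },
      "typeProperties": {
        "functionName": "ProcessNewBlob",
        "method": "POST",
        "body": { "message": "hello from ADF" }
      }
    }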
At the moment you can only use * and ? in the file filter. It would be very helpful if the partitionedBy section, which can already be used for the folderPath, could also be used in the fileFilter or the fileName.
This would allow scenarios where you need files like myName-2015-07-01.txt, where the slice date and time are part of the filename. (268 votes)
Thank you for your feedback. This can be accomplished in ADF V2 by passing the value of a trigger variable through pipeline parameters and dataset parameters. You can then construct a parameterized folder path and/or filename, e.g. myName-yyyy-MM-dd.txt in your example.
Please refer to this article for more details: https://docs.microsoft.com/en-us/azure/data-factory/how-to-read-write-partitioned-data
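For illustration, a dataset whose fileName is built from a dataset parameter; the pipeline or trigger would pass in something like @formatDateTime(trigger().scheduledTime, 'yyyy-MM-dd') as sliceDate (all names here are placeholders):

    {
      "name": "DailyNamedFileDataset",
      "properties": {
        "type": "AzureBlob",
        "linkedServiceName": { "referenceName": "BlobStorageLinkedService", "type": "LinkedServiceReference" },
        "parameters": { "sliceDate": { "type": "String" } },
        "typeProperties": {
          "folderPath": "input",
          "fileName": {
            "value": "@concat('myName-', dataset().sliceDate, '.txt')",
            "type": "Expression"
          },
          "format": { "type": "TextFormat" }
        }
      }
    }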
Currently, ADF supports only SQL Server authentication for Azure SQL Database and Azure SQL Data Warehouse data sources. Since both Azure SQL Database and Azure SQL Data Warehouse provide AAD authentication, ADF should start supporting this. (229 votes)
Thank you all for your feedback. ADF now supports AAD authentication (Service principal and Managed identities for Azure resources) for both Azure SQL Database and Azure SQL Data Warehouse.
Please check out details here: https://docs.microsoft.com/en-us/azure/data-factory/connector-azure-sql-database#linked-service-properties
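A sketch of an Azure SQL Database linked service using service principal (AAD) authentication; the connection string, IDs, and tenant are placeholders. For managed identity, the servicePrincipal* and tenant properties are simply omitted:

    {
      "name": "AzureSqlDatabaseLinkedService",
      "properties": {
        "type": "AzureSqlDatabase",
        "typeProperties": {
          "connectionString": "Data Source=tcp:myserver.database.windows.net,1433;Initial Catalog=mydb;",
          "servicePrincipalId": "<application (client) id>",
          "servicePrincipalKey": { "type": "SecureString", "value": "<application key>" },
          "tenant": "<tenant name or id>"
        }
      }
    }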
Need to integrate with SAP but there is no Linked Service option for SAP. (191 votes)
We have enabled bulk data loading from SAP BW using Open Hub: https://docs.microsoft.com/en-us/azure/data-factory/connector-sap-business-warehouse-open-hub
We also recently enabled the SAP Table connector, which enables data integration from SAP tables in SAP ECC, SAP BW, SAP S/4HANA, etc.
Stream Analytics can write in JSON format to blob (line separated), but the output can't be used later in Data Factory. This is a big miss! (190 votes)
JSON format is now supported in ADF. Refer to https://azure.microsoft.com/en-us/documentation/articles/data-factory-azure-blob-connector/#azure-blob-dataset-type-properties
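For example, a blob dataset over the line-separated JSON that Stream Analytics writes could look roughly like this ("setOfObjects" is the file pattern for line-delimited objects; names and folder path are placeholders):

    {
      "name": "AsaJsonOutputDataset",
      "properties": {
        "type": "AzureBlob",
        "linkedServiceName": { "referenceName": "BlobStorageLinkedService", "type": "LinkedServiceReference" },
        "typeProperties": {
          "folderPath": "asa-output",
          "format": {
            "type": "JsonFormat",
            "filePattern": "setOfObjects"
          }
        }
      }
    }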
We have a job that needs to be run throughout the day. Sometimes the job runs long, sometimes it runs short. We need to start the next run as soon as the previous one finishes. We can't use a tumbling window because we disable the job at night, and when we re-enable a tumbling window trigger, it wants to run all the jobs it missed.
Please add a concurrency flag to the schedule trigger, or add a scheduling component to the tumbling window trigger to disable the trigger during certain times. (158 votes)
A new “concurrency” setting has been added to the pipeline definition. If you set it to 1, there will be at most one pipeline run at a time.
Look for it in the pipeline authoring UX. As an example, here is the REST API for pipeline creation: https://docs.microsoft.com/en-us/rest/api/datafactory/pipelines/createorupdate#request-body
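In the pipeline JSON it is a top-level property; a trimmed sketch with the activities omitted:

    {
      "name": "MyPipeline",
      "properties": {
        "concurrency": 1,
        "activities": []
      }
    }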
Now that we have Azure Functions available, which can help make batch data processing more real-time, it would be great to be able to programmatically invoke the workflow component of ADF for immediate execution of a pipeline. (157 votes)
You can do this in ADF V2 using any of the programmatic interfaces: PowerShell, .NET SDK, Python SDK, and REST API.
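For example, an on-demand run can be started with a single REST call (the resource names in the path are placeholders):

    POST https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.DataFactory/factories/{factoryName}/pipelines/{pipelineName}/createRun?api-version=2018-06-01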
Thanks for your feedback. Now you can use ADF to copy data from FTP/s into various data stores. You are invited to give it a try.
You can use the copy wizard to easily author the copy pipeline. And the documentation on FTP/s connector can be found at https://azure.microsoft.com/en-us/documentation/articles/data-factory-ftp-connector/.
Note that SFTP is not covered by this connector; it will be added later. Writing to FTP is also not covered at this time.
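A rough sketch of an FTP linked service (host and credentials are placeholders; enableSsl turns on FTPS):

    {
      "name": "FtpLinkedService",
      "properties": {
        "type": "FtpServer",
        "typeProperties": {
          "host": "ftp.example.com",
          "port": 21,
          "enableSsl": true,
          "authenticationType": "Basic",
          "userName": "myuser",
          "password": { "type": "SecureString", "value": "<password>" }
        }
      }
    }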
Add configurable REST and SOAP web service sources, so that Data Factory can ingest data from other cloud services.
There are many cloud applications that expose data via a SOAP or REST API. Customers should be able to configure generic REST and SOAP data sources for use in Azure Data Factory. Other ELT and ETL tools such as Dell Boomi, Informatica, SSIS, and Talend have this functionality. (148 votes)
Thank you for your inputs! REST connector is now available: https://docs.microsoft.com/en-us/azure/data-factory/connector-rest
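A minimal sketch of a REST linked service; the URL and credentials are placeholders:

    {
      "name": "RestServiceLinkedService",
      "properties": {
        "type": "RestService",
        "typeProperties": {
          "url": "https://api.example.com/v1/",
          "enableServerCertificateValidation": true,
          "authenticationType": "Basic",
          "userName": "myuser",
          "password": { "type": "SecureString", "value": "<password>" }
        }
      }
    }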
There is a limitation that a Data Management Gateway (DMG) on a machine can only be used by a single data factory; it cannot be shared among different data factories. We need to move to another machine if we have to use the same gateway for another data factory, or create a new gateway to connect to a different on-premises SQL Server. (137 votes)
Thanks for the feedback! This feature is now fully enabled.
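In ADF V2 the gateway has become the self-hosted integration runtime, which can be shared: one factory hosts it and other factories reference it through a linked integration runtime. A rough sketch of the linked IR definition, assuming the hosting factory has granted access (the resource ID is a placeholder):

    {
      "name": "LinkedSelfHostedIR",
      "properties": {
        "type": "SelfHosted",
        "typeProperties": {
          "linkedInfo": {
            "resourceId": "<ARM resource id of the shared integration runtime>",
            "authorizationType": "Rbac"
          }
        }
      }
    }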
Now that Azure Database for PostgreSQL is GA and available as an ADF source, we really want to have it as a sink as well to fulfil our data loading requirement. (134 votes)
Azure Data Factory now supports Azure Database for PostgreSQL as a sink. Check the documentation: https://docs.microsoft.com/en-us/azure/data-factory/connector-azure-database-for-postgresql
We have added a first-class Delete Activity which accomplishes this requirement: https://docs.microsoft.com/en-us/azure/data-factory/delete-activity
Provide a folder option to manage multiple datasets and pipelines. The ADF diagram should also be based on folder/area.
We have around 100 datasets and 10 different pipelines. This will grow to 2,000 datasets and 150 pipelines in the future based on business functionality, data categorization, and dependencies. We already see a problem in managing them, as we are not able to arrange them in folders (collapsible), and the diagram becomes difficult to explain. If we could manage all datasets/pipelines related to one area in a specific folder, and also have a diagram view scoped to that folder level, it would simplify things a lot. See the Before and After attachment for more clarity. (118 votes)
Thank you everyone for your feedback in this area! We have enabled the ability to create folders in the ADF authoring UI: launch “Author & Monitor” from the factory blade, then in the left nav of the resource explorer click the “+” sign to create folders for Pipelines, Datasets, Data Flows, and Templates.
Additionally, with the rich parameterization support in ADF V2, you can do a dynamic lookup and pass an array of values into a parameterized dataset, which drastically reduces the need to create or maintain a large number of hard-coded datasets or pipelines. Please refer to this concrete example of using Lookup + ForEach + Copy to load from a large set of tables: https://docs.microsoft.com/en-us/azure/data-factory/tutorial-bulk-copy-portal
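As a trimmed sketch of that pattern (dataset names, the control query, and the sink are all placeholders), the ForEach iterates over the rows returned by the Lookup and each iteration copies one table:

    {
      "name": "BulkCopyPipeline",
      "properties": {
        "activities": [
          {
            "name": "LookupTableList",
            "type": "Lookup",
            "typeProperties": {
              "source": { "type": "AzureSqlSource", "sqlReaderQuery": "SELECT TABLE_SCHEMA, TABLE_NAME FROM INFORMATION_SCHEMA.TABLES" },
              "dataset": { "referenceName": "ControlTableDataset", "type": "DatasetReference" },
              "firstRowOnly": false
            }
          },
          {
            "name": "CopyEachTable",
            "type": "ForEach",
            "dependsOn": [ { "activity": "LookupTableList", "dependencyConditions": [ "Succeeded" ] } ],
            "typeProperties": {
              "items": { "value": "@activity('LookupTableList').output.value", "type": "Expression" },
              "activities": [
                {
                  "name": "CopyOneTable",
                  "type": "Copy",
                  "inputs": [ { "referenceName": "SourceSqlDataset", "type": "DatasetReference" } ],
                  "outputs": [ { "referenceName": "SinkBlobDataset", "type": "DatasetReference" } ],
                  "typeProperties": {
                    "source": {
                      "type": "AzureSqlSource",
                      "sqlReaderQuery": {
                        "value": "@concat('SELECT * FROM [', item().TABLE_SCHEMA, '].[', item().TABLE_NAME, ']')",
                        "type": "Expression"
                      }
                    },
                    "sink": { "type": "BlobSink" }
                  }
                }
              ]
            }
          }
        ]
      }
    }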
MySQL as a destination data source. (93 votes)
Azure Data Factory now supports Azure Database for MySQL as a sink. Check this service update: https://azure.microsoft.com/updates/azure-data-factory-now-supports-copying-data-into-azure-database-for-mysql/
With the copy data activity, it would be massively helpful to have a pipeline with Slowly Changing Dimension capability or something similar to Merge functionality, where the pipeline can perform data validation before inserting. This is one of the great features in SSIS, and it would be great to have it in ADF. (92 votes)
We're using Azure Key Vault to manage all of our secrets and credentials for consumption by client applications. It would be nice if we could add Key Vault as a source of connection strings or other variables in our JSON, and have them populated by the Data Factory service at deploy/runtime. (90 votes)
Thanks for your feedback. This is now supported: https://docs.microsoft.com/en-us/azure/data-factory/store-credentials-in-key-vault
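A minimal sketch of a linked service pulling its connection string from Key Vault (the Key Vault linked service name and secret name are placeholders):

    {
      "name": "AzureSqlDatabaseLinkedService",
      "properties": {
        "type": "AzureSqlDatabase",
        "typeProperties": {
          "connectionString": {
            "type": "AzureKeyVaultSecret",
            "store": { "referenceName": "AzureKeyVaultLinkedService", "type": "LinkedServiceReference" },
            "secretName": "SqlConnectionString"
          }
        }
      }
    }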
Currently only FTPS is supported. Please add support for SFTP, both as source and sink. (86 votes)
Copying data from SFTP is live now; refer to the connector doc for more details: https://docs.microsoft.com/en-us/azure/data-factory/data-factory-sftp-connector. If you are looking to copy data into SFTP (as a sink), vote/comment on this item for tracking: https://feedback.azure.com/forums/270578-data-factory/suggestions/18742906-support-sftp-as-sink
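A rough sketch of an SFTP linked service for the source side (host and credentials are placeholders; SSH public key authentication is also supported):

    {
      "name": "SftpLinkedService",
      "properties": {
        "type": "Sftp",
        "typeProperties": {
          "host": "sftp.example.com",
          "port": 22,
          "authenticationType": "Basic",
          "userName": "myuser",
          "password": { "type": "SecureString", "value": "<password>" }
        }
      }
    }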