Data Factory
Azure Data Factory allows you to manage the production of trusted information by offering an easy way to create, orchestrate, and monitor data pipelines over the Hadoop ecosystem using structured, semi-structured, and unstructured data sources. You can connect to your on-premises SQL Server, Azure database, tables or blobs and create data pipelines that will process the data with Hive and Pig scripting, or custom C# processing. The service offers a holistic monitoring and management experience over these pipelines, including a view of their data production and data lineage down to the source systems. The outcome of Data Factory is the transformation of raw data assets into trusted information that can be shared broadly with BI and analytics tools.
Do you have an idea, suggestion or feedback based on your experience with Azure Data Factory? We’d love to hear your thoughts.
-
Support ADF Projects in Visual Studio 2017
Currently Visual Studio 2017 does not support Azure Data Factory projects.
Although the Azure SDK is now included in VS2017 along with all other services, the ADF project files aren't.
Can you please include this feature so developers can upgrade from VS2015?
Thanks
2,526 votes -
Static IP ranges for Data Factory and add ADF to list of Trusted Azure Services
It is not currently possible to identify the IP Address of the DF, which you need for firewall rules, including Azure SQL Server firewall....
1,982 votes
Great news – static IP range for Azure Integration Runtime is now available in all ADF regions! You can whitelist specific IP ranges for ADF as part of firewall rules. The IPs are documented here: https://docs.microsoft.com/en-us/azure/data-factory/azure-integration-runtime-ip-addresses#azure-integration-runtime-ip-addresses-specific-regions. Static IP ranges for gov cloud and China cloud will be published soon!
Please refer to this blog post on how you can use various mechanisms including trusted Azure service and static IP to secure data access through ADF:
https://techcommunity.microsoft.com/t5/azure-data-factory/azure-data-factory-now-supports-static-ip-address-ranges/ba-p/1117508
Service tag support will be made available in next few weeks. Please stay tuned!
If your network security requirement calls for ADF support for VNet and cannot be met using Trusted Azure service (released in Oct 2019), static IP range (released in Jan 2020), or service tag (upcoming), please vote for VNet feature here: https://feedback.azure.com/forums/270578-data-factory/suggestions/37105363-data-factory-should-be-able-to-use-vnet-without-re
-
Ability to Disable an Activity
Please allow setting an activity to enabled or disabled, much as you can in SSIS.
This is important when you are developing and only want to execute a certain part of the pipeline, for example.
1,594 votes -
Throw Error Activity
If my pipeline orchestrates an asynchronous operation, such as processing an Azure Analysis Services model, then the pattern is to start the operation asynchronously, then loop and check the status. If the status is failed, the REST API just says status=Failed, but it does not return an HTTP 500 status code, so ADF does not fail. So I need a new Throw Error Activity component that will let me build an expression for the ErrorMessage property and throw an error message. Better yet there would be a property which does not throw an error if the property is set…
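To make the start-then-poll pattern in this request concrete, here is a minimal Python sketch. It is not ADF code: `get_status` is a hypothetical callable standing in for the REST status check, and the status strings "Succeeded" and "Failed" are assumptions based on the description above. The point it illustrates is that the caller has to convert a reported failure into an error explicitly, because the status request itself succeeds.

```python
import time
from typing import Callable


class OperationFailed(Exception):
    """Raised when the polled operation reports a failed status."""


def wait_for_completion(
    get_status: Callable[[], str],
    poll_seconds: float = 30.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status until it returns a terminal status.

    get_status is a hypothetical stand-in for the REST status call; the
    status strings are assumptions, not a documented contract.
    """
    for _ in range(max_polls):
        status = get_status()
        if status == "Succeeded":
            return status
        if status == "Failed":
            # The status endpoint answered successfully, so the failure has
            # to be turned into an error here, which is what the requested
            # Throw Error Activity would do inside a pipeline.
            raise OperationFailed("operation reported status=Failed")
        sleep(poll_seconds)
    raise TimeoutError(f"no terminal status after {max_polls} polls")
```

Passing `sleep` as a parameter is only a convenience so the loop can be exercised without real waiting.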
875 votes -
Azure Data Factory - Google Analytics Connector
Some customers need to extract information from Google Analytics in order to build a data lake or SQL DW and gather marketing insights by combining it with other kinds of data.
Today the options are paid custom SSIS packages or developing custom code.
If this is not possible in Azure Data Factory, could there be another way to extract this data with a native connector in Azure … maybe Logic Apps?
798 votes
Thank you for your feedback. We will evaluate this ask as part of the product roadmap.
-
Unit Testing for ADF Projects
There has to be support for automated testing of Azure Data Factory pipelines - perhaps as part of Visual Studio ADF project suite.
780 votes -
Need of Execute SQL Task in Azure Data Factory v2
We only have an Execute Stored Procedure activity in ADFv2. But most of the time we don't want to create a stored procedure for every basic ETL task, such as counting the number of records in a table, updating data in tables, creating tables, etc. Many such activities need T-SQL execution. It would be great to have an Execute SQL option.
ADFv2 can use a variety of RDBMS source and sink systems such as MySQL, Oracle, etc. ESQL would be the powerful task to have in Azure Data Factory V2 to be used in all…
722 votes
Thank you for your feedback! We will evaluate this as part of our product roadmap.
-
Add ability to customize output fields from Execute Pipeline Activity
This request comes directly from a StackOverflow post, https://stackoverflow.com/questions/57749509/how-to-get-custom-output-from-an-executed-pipeline .
Currently, the output from the execute pipeline activity is limited to the pipeline's name and runId of the executed pipeline, making it difficult to pass any data or settings from the executed pipeline back to the parent pipeline. For instance, if a variable is set in the child pipeline, there is no built-in way to pass this variable in the Azure Data Factory UI. There exist a couple of workarounds as detailed in the above StackOverflow post, but adding this as a built-in feature would greatly enhance the ability…
709 votes -
Data factory should be able to use VNet without resorting to self hosted
Self-hosted makes a lot of sense when integrating on-premises data; however, it's a shame to need to maintain a self-hosted integration runtime VM when wishing to leverage the extra security of a VNet, i.e. firewalled storage accounts etc.
Ideally the Azure managed integration runtimes would be able to join a VNet on demand.
705 votes
We are very excited to announce the public preview of Azure Data Factory Managed Virtual Network.
With this new feature, you can provision the Azure Integration Runtime in Managed Virtual Network and leverage Private Endpoints to securely connect to supported data stores. Your data traffic between Azure Data Factory Managed Virtual Network and data stores goes through Azure Private Link, which provides secured connectivity and eliminates your data exposure to the public internet. With the Managed Virtual Network along with Private Endpoints, you can also offload the burden of managing the virtual network to Azure Data Factory and protect against data exfiltration.
To learn more about Azure Data Factory Managed Virtual Network, see https://azure.microsoft.com/blog/azure-data-factory-managed-virtual-network/
-
Refreshing Azure Analysis Cube
Azure Data Factory pipeline activity to refresh Azure Analysis Services cube partitions.
626 votes
For now check and use the custom activity sample from https://github.com/Azure/Azure-DataFactory/tree/master/Samples/AzureAnalysisServicesProcessSample and this is also mentioned in the AS GA blog (https://azure.microsoft.com/en-us/blog/announcing-azure-analysis-services-general-availability/). We are further evaluating the future native 1st-class support.
-
Allow choosing logical AND or logical OR in activity dependencies
We have activity dependencies today, but they are always logical AND. If we have Activity1 -> Activity2 -> Activity3 and we want to say if any of these activities fail, run Activity4, it isn't straightforward. In SSIS, we can choose an expression and choose whether we need one or all conditions to be true when there are multiple constraints. We need similar functionality here. It can be achieved with a bit of creativity (repeat the failure activity as the single failure path after each of the original activities use the If Condition to write logic that would…
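The difference between AND and OR semantics in this request can be sketched in a few lines of Python. This is an illustration only, not how ADF evaluates dependencies: `should_run`, the `mode` argument, and the outcome strings are hypothetical names chosen for the example.

```python
from typing import Dict, List, Tuple

# (activity name, outcome that the dependency requires)
Dependency = Tuple[str, str]


def should_run(
    dependencies: List[Dependency],
    outcomes: Dict[str, str],
    mode: str = "and",
) -> bool:
    """Decide whether a downstream activity runs.

    mode="and" requires every dependency condition to hold (the behaviour
    the request describes today); mode="or" requires at least one.
    """
    checks = [outcomes.get(name) == required for name, required in dependencies]
    if mode == "and":
        return all(checks)
    if mode == "or":
        return any(checks)
    raise ValueError(f"unknown mode: {mode!r}")


# Activity4 should run if any of the three upstream activities failed.
on_failure = [("Activity1", "Failed"), ("Activity2", "Failed"), ("Activity3", "Failed")]
# Activity2 failed, so Activity3 never ran and has no recorded outcome.
run_outcomes = {"Activity1": "Succeeded", "Activity2": "Failed"}

print(should_run(on_failure, run_outcomes, mode="and"))  # False
print(should_run(on_failure, run_outcomes, mode="or"))   # True
```

With AND-only semantics the failure handler never fires in this run, which is why the request asks for a selectable OR.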
541 votes -
Bitbucket Integration
We need to use Bitbucket for a project. We are mirroring our Azure DevOps repo with the pipelines to Bitbucket. It would be easier if there was integration with Bitbucket.
491 votes
Thank you for your feedback. We will evaluate this ask as part of our product roadmap.
-
Support for Daylight Savings Time for Trigger Schedules
When setting up the timing of a Trigger, you need to know how far away from UTC you are so you can specify the right time. That value changes for those of us that observe Daylight Savings Time.
The dialog box for setting up a Trigger Schedule should instead have the following three inputs:
1) the LOCAL time you want it to run
2) the Time Zone
3) Adjust for DST.
THAT is the information people have at their disposal.
To adjust for DST, I must EDIT all my Triggers manually to ensure they run at the right hour of the day…
483 votes
Released for Schedule Triggers
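The problem described in this request is easy to see with a short Python sketch. It uses the standard-library `zoneinfo` module (Python 3.9+, and it assumes the system time zone database is available); the helper name `local_to_utc`, the dates, and the choice of "America/New_York" are only examples. The same 08:00 local run time maps to two different UTC times depending on the season, which is why a trigger defined in UTC has to be edited twice a year.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # standard library in Python 3.9+


def local_to_utc(year, month, day, hour, minute, tz_name):
    """Convert a local wall-clock time in tz_name to UTC."""
    local = datetime(year, month, day, hour, minute, tzinfo=ZoneInfo(tz_name))
    return local.astimezone(timezone.utc)


# The same local run time, once in winter and once in summer.
winter = local_to_utc(2021, 1, 15, 8, 0, "America/New_York")
summer = local_to_utc(2021, 7, 15, 8, 0, "America/New_York")

print(winter.strftime("%H:%M"))  # 13:00 (UTC-5, standard time)
print(summer.strftime("%H:%M"))  # 12:00 (UTC-4, daylight time)
```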
-
Add support for Power Query / Power BI Data Catalog as Data Store/ Linked Service
Power Query is awesome! It would be a great feature to be able to output its result into either a SQL database or Azure (Storage or SQL).
477 votes
Please check the new capability we recently unveiled called Wrangling Data Flows, available in preview! Wrangling Data Flow allows you to discover and explore your data using the familiar Power Query Online mashup editor to do data preparation, and then execute at scale using Spark runtime.
Sign up for preview access at: https://forms.office.com/Pages/ResponsePage.aspx?id=v4j5cvGGr0GRqy180BHbR9-OHbkcd7NIvtztVhbGIU9UNk5QM0dSWkFDSkFPUlowTFJMRVZUUUZGRi4u and check out more details at https://aka.ms/wranglingdfdocs
-
Powershell Script support in Activity
Please add support to run a PowerShell script as an activity inside Azure Data Factory. It will help developers overcome most of the shortcomings that scripting can address.
378 votes -
Please add support to specify longer timeout for Web Activity
Data Factory version 2 currently supports Web Activities with a default timeout of 1 minute:
https://docs.microsoft.com/en-us/azure/data-factory/control-flow-web-activity
"REST endpoints that the web activity invokes must return a response of type JSON. The activity will timeout at 1 minute with an error if it does not receive a response from the endpoint."
Please add the ability to specify a longer timeout period for complex tasks.
368 votes -
Web and ODATA connectors need to support OAuth
The Web and OData connectors need to add support for OAuth ASAP. Most other Microsoft services (Office 365, PWA, CRM, etc.) along with many other industry APIs require the use of OAuth. Not having this closes the door to lots of integration scenarios.
364 votes
The OData connector now supports AAD-based OAuth; you can try it out via the copy wizard. More details on the OData connector can be found at https://azure.microsoft.com/en-us/documentation/articles/data-factory-odata-connector/. Keep this feedback live – if you need OAuth for Web, please leave a comment.
-
Post-copy script in Copy Activity
The copy activity has a pre-copy script feature. A similar post-copy script feature would make it possible to execute code from the same activity once the copy operation has completed.
Traditionally, when data is copied from a source SQL database to a destination SQL database, it is copied incrementally from the source into temporary, staging, or in-memory tables in the destination. After the copy, merge code is executed to merge the data into the target table.
If a post-copy script option is provided in the copy activity, the merge code can be called from the copy activity itself instead of from another activity such as Execute Stored Procedure.
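The stage-then-merge flow described above can be sketched with Python's built-in `sqlite3` module. This is only an illustration of the pattern, not of any ADF feature: the table names `customers` and `stage_customers` and the function `copy_then_merge` are hypothetical, and SQLite's `INSERT OR REPLACE` stands in for whatever merge statement the destination database would use.

```python
import sqlite3


def copy_then_merge(conn, rows):
    """Load rows into a staging table, then merge them into the target."""
    # Pre-copy step: clear the staging table.
    conn.execute("DELETE FROM stage_customers")
    # Copy step: land the incoming rows in staging.
    conn.executemany(
        "INSERT INTO stage_customers (id, name) VALUES (?, ?)", rows
    )
    # Post-copy step: merge staging into the target, updating existing
    # ids and adding new ones. This is the part the request wants to run
    # from the copy activity itself.
    conn.execute(
        "INSERT OR REPLACE INTO customers (id, name) "
        "SELECT id, name FROM stage_customers"
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE stage_customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customers VALUES (1, 'old name')")

copy_then_merge(conn, [(1, "new name"), (2, "added")])
merged = conn.execute("SELECT id, name FROM customers ORDER BY id").fetchall()
print(merged)  # [(1, 'new name'), (2, 'added')]
```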
304 votes -
Add a new email activity with the ability to send attachments as part of the workflow.
There are numerous instances when an output (statistics) or error file has to be mailed to administrators. Email as an activity will help in implementing this functionality.
269 votes -
Allow parameters for pipeline reference name in ExecutePipeline activity
The ExecutePipeline activity accepts only a constant pipeline name. It should allow invoking a pipeline dynamically by accepting a parameter for the pipeline reference name.
267 votes