It is not currently possible to identify the IP address of the Data Factory, which you need for firewall rules, including the Azure SQL Server firewall. (1,958 votes)
Great news – static IP ranges for the Azure Integration Runtime are now available in all ADF regions! You can whitelist specific IP ranges for ADF as part of firewall rules. The IPs are documented here: https://docs.microsoft.com/en-us/azure/data-factory/azure-integration-runtime-ip-addresses#azure-integration-runtime-ip-addresses-specific-regions. Static IP ranges for the gov cloud and China cloud will be published soon!
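For illustration, a minimal sketch (standard library only) of checking whether a caller's IP falls inside the published Azure IR ranges before whitelisting it; the CIDR blocks below are hypothetical placeholders for the documented region-specific ranges:

import ipaddress

# Hypothetical placeholders -- substitute the real ranges for your ADF region
# from the documentation page linked above.
adf_ir_ranges = ["20.37.14.0/28", "40.74.24.192/26"]

def is_adf_ip(source_ip: str) -> bool:
    # True if source_ip falls inside one of the whitelisted ADF IR ranges.
    addr = ipaddress.ip_address(source_ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in adf_ir_ranges)

print(is_adf_ip("20.37.14.5"))  # True for the hypothetical first range
print(is_adf_ip("8.8.8.8"))     # False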
Please refer to this blog post on how you can use the various mechanisms, including Trusted Azure services and static IP ranges, to secure data access through ADF:
Service tag support will be made available in the next few weeks. Please stay tuned!
If your network security requirements call for ADF VNet support and cannot be met using Trusted Azure services (released in Oct 2019), static IP ranges (released in Jan 2020), or service tags (upcoming), please vote for the VNet feature here: https://feedback.azure.com/forums/270578-data-factory/suggestions/37105363-data-factory-should-be-able-to-use-vnet-without-re
Add Excel files as a source. (855 votes)
We are working on adding support for Excel as source format in Azure Data Factory Copy activity and Mapping Data Flow. Thanks for all the feedback and please stay tuned!
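In the meantime, a common stopgap is to convert workbooks into a format the Copy activity already reads. A minimal sketch with pandas (assumes pandas plus an Excel engine such as openpyxl are installed; the file and sheet names are hypothetical):

import pandas as pd

# Read one worksheet and re-emit it as CSV that ADF can copy today.
df = pd.read_excel("sales.xlsx", sheet_name="Sheet1")
df.to_csv("sales.csv", index=False)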
Can we have a copy activity for XML files, along with validating the schema of an XML file against an XSD? This would be helpful: if schema validation succeeds, then copy; else, fail the activity. This would be useful for the scenarios below:
1. Blob to Blob
2. Blob to SQL
3. SQL to Blob
If all of the above can work with a specified schema, that would be great (see the validation sketch after the team's reply below). (679 votes)
Thanks all for the feedback. We have started working on adding support for XML as a source format in the Azure Data Factory Copy activity and Mapping Data Flow. Please stay tuned.
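In the meantime, the requested validate-then-copy gate can be approximated client-side. A minimal sketch using lxml (the file names are hypothetical):

from lxml import etree

schema = etree.XMLSchema(etree.parse("orders.xsd"))
doc = etree.parse("orders.xml")

if schema.validate(doc):
    pass  # validation succeeded: proceed with the copy (Blob/SQL scenarios above)
else:
    raise ValueError(f"XSD validation failed: {schema.error_log}")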
Power Query is awesome! It would be a great feature to be able to output its result into either a SQL database or Azure (Storage or SQL). (474 votes)
Please check out the new capability we recently unveiled, called Wrangling Data Flows, available in preview! Wrangling Data Flows allow you to discover and explore your data using the familiar Power Query Online mashup editor for data preparation, and then execute at scale on the Spark runtime.
Sign up for preview access at: https://forms.office.com/Pages/ResponsePage.aspx?id=v4j5cvGGr0GRqy180BHbR9-OHbkcd7NIvtztVhbGIU9UNk5QM0dSWkFDSkFPUlowTFJMRVZUUUZGRi4u and check out more details at https://aka.ms/wranglingdfdocs
Provide the capability to copy data from Blob to the Snowflake data warehouse. (440 votes)
Azure Data Factory now supports the Snowflake connector as a source and sink in the Copy activity. Data Flow support will come later. Learn more from
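Until Data Flow support arrives, staged blob files can also be loaded into Snowflake directly. A minimal sketch with snowflake-connector-python; the account, credentials, table, and external stage (@blob_stage) are hypothetical and assumed to be configured already:

import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account", user="my_user", password="my_password",
    warehouse="my_wh", database="my_db", schema="public",
)
try:
    # Bulk-load the staged files; the stage points at the blob container.
    conn.cursor().execute(
        "COPY INTO target_table FROM @blob_stage FILE_FORMAT = (TYPE = CSV)"
    )
finally:
    conn.close()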
The Web and OData connectors need to add support for OAuth ASAP. Most other Microsoft services (Office 365, PWA, CRM, etc.), along with many other industry APIs, require the use of OAuth. Not having this closes the door to many integration scenarios. (337 votes)
The OData connector now supports AAD-based OAuth; you can try it out via the Copy wizard. More details on the OData connector can be found at https://azure.microsoft.com/en-us/documentation/articles/data-factory-odata-connector/. Keep this feedback item alive – if you need OAuth for the Web connector, please leave a comment.
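For reference, AAD-based OAuth here boils down to the OAuth 2.0 client credentials grant. A minimal sketch of acquiring a token with requests; the tenant, client id/secret, and resource URI are hypothetical placeholders:

import requests

tenant = "contoso.onmicrosoft.com"  # hypothetical AAD tenant
resp = requests.post(
    f"https://login.microsoftonline.com/{tenant}/oauth2/token",
    data={
        "grant_type": "client_credentials",
        "client_id": "<app-client-id>",
        "client_secret": "<app-client-secret>",
        "resource": "https://myservice.example.com/",  # the OData service's app ID URI
    },
)
token = resp.json()["access_token"]  # send as "Authorization: Bearer <token>"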
Provide better cost analysis capabilities (either in the Azure portal or in ADF). Right now it is impossible to see costs by pipeline or activity – you can only see the overall cost of the whole Data Factory instance, which is not very useful.
Please add billing per pipeline (Logic Apps is a good example, where you can track costs for each logic app). (38 votes)
Thank you for your great feedback! We have started working on enabling this and will keep you updated as it becomes available!
As far as I know, the IP address of Data Flow cannot be specified, and Data Flow isn't included in the trusted Microsoft services.
To enhance security on Azure SQL DB, Azure Storage, etc., please consider adding these features. (21 votes)
Currently, it's only possible to do numerical aggregations (count, sum, etc.) in Data Flow aggregations. Implementing something that works like SQL's string_agg would be very helpful. (16 votes)
It would be nice to have a sample where we can use Data Factory in an IoT scenario to get started more quickly.
I would really appreciate this! (8 votes)
More file formats should be allowed; copy to Azure Blob does not appear to support PDF, Word, image formats, and others.
It would be really great if we could have some process in place to read PDF, Word, and image files (unstructured data). (7 votes)
You can currently copy any file format via the Copy activity: simply do not provide the structure element in the dataset. But we do want to surface this in a first-class manner.
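To illustrate the same idea outside ADF, a minimal sketch of a format-agnostic binary copy with the azure-storage-blob SDK; the connection string, containers, and blob name are hypothetical:

from azure.storage.blob import BlobServiceClient

service = BlobServiceClient.from_connection_string("<connection-string>")
src = service.get_blob_client("source-container", "report.pdf")
dst = service.get_blob_client("dest-container", "report.pdf")

# No format parsing at all -- bytes in, bytes out, so PDF/Word/images all work.
dst.upload_blob(src.download_blob().readall(), overwrite=True)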
Currently, a pipeline query in the portal editor can only be one line, with no syntax highlighting.
This makes it hard to read and edit, easy to introduce errors (particularly when escaping characters), and hard to spot them.
Please add syntax highlighting, and allow the query to span multiple lines (even when in a Text.Format macro). (6 votes)
Thanks for your feedback. We are working on an authoring experience that will allow you to use syntax highlighting. For queries spanning multiple lines, you can store your query in your storage account and reference its path in the ‘scriptpath’ parameter. This allows your query to span multiple lines while using ‘Text.Format’.
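A minimal sketch of that workaround: upload the multi-line query to your storage account, then point the activity's ‘scriptpath’ at it. The connection string, container, and file name below are hypothetical:

from azure.storage.blob import BlobClient

query = """\
SELECT customer_id,
       SUM(amount) AS total
FROM   sales
GROUP  BY customer_id
"""

blob = BlobClient.from_connection_string(
    "<connection-string>", container_name="scripts", blob_name="sales_agg.hql"
)
blob.upload_blob(query, overwrite=True)  # the activity references 'scripts/sales_agg.hql'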
At present you need co-admin rights to deploy. Businesses cannot give out these rights. As a subscription owner, I should be able to deploy from VS, as these rights give me access in the portal to create and delete! (5 votes)
The team is working on relaxing these constraints. Please stay tuned.
While aggregating columns using the Aggregate transformation, it is not possible to aggregate columns containing text.
Similar functionality is available in PySpark.
For example, below is PySpark code that cannot currently be expressed in an Azure Data Flow:
from pyspark.sql import functions as F
df.groupby(['customerid', 'month', 'year']).agg(
    F.concat_ws(", ", F.collect_list(df.text)).alias('aggdescr'),
    F.min('minbalance').alias('balance'))
(3 votes)
It's very hard to get a clear understanding of the cost of executing a pipeline or a data flow. It would be very helpful to have cost estimates based on the current configuration (runtime, etc.). Furthermore, a cost projection akin to that elsewhere in the Azure platform would be very helpful. Also, it would be useful to see how many clusters are alive on a given runtime and be able to shut them down individually. (2 votes)