Data Factory
Azure Data Factory allows you to manage the production of trusted information by offering an easy way to create, orchestrate, and monitor data pipelines over the Hadoop ecosystem, using structured, semi-structured, and unstructured data sources. You can connect to your on-premises SQL Server, Azure database, tables, or blobs and create data pipelines that process the data with Hive and Pig scripting, or custom C# processing. The service offers a holistic monitoring and management experience over these pipelines, including a view of their data production and data lineage down to the source systems. The outcome of Data Factory is the transformation of raw data assets into trusted information that can be shared broadly with BI and analytics tools.
Do you have an idea, suggestion or feedback based on your experience with Azure Data Factory? We’d love to hear your thoughts.
-
Configure for singleton Pipeline Run
For wall-clock (schedule) triggers, there should be a property that controls whether a new run of the pipeline is allowed to start while a previous run is still in progress; a sketch of what this might look like follows.
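A rough sketch of what this could look like on a schedule trigger. The skipIfRunning property is hypothetical; for comparison, ADF V2 pipelines already expose a concurrency property, but that queues overlapping runs rather than skipping them:

{
    "name": "EveryMinuteTrigger",
    "properties": {
        "type": "ScheduleTrigger",
        "typeProperties": {
            "recurrence": {
                "frequency": "Minute",
                "interval": 1
            },
            "skipIfRunning": true // hypothetical property, not part of ADF today
        },
        "pipelines": [
            {
                "pipelineReference": {
                    "referenceName": "MyPipeline",
                    "type": "PipelineReference"
                }
            }
        ]
    }
}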
53 votes -
Support encrypted flat files as the source in copy activities
We use this approach to encrypt sensitive flat files at rest. Please add a feature to ADF to support reading from encrypted flat files in blob storage:
https://azure.microsoft.com/en-us/documentation/articles/storage-encrypt-decrypt-blobs-key-vault/
52 votes
Thanks for the feedback. We are looking at this.
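If this were supported, the copy activity's blob source might grow a decryption block along these lines. Everything under "decryption" is hypothetical; only the Key Vault reference mirrors the style ADF already uses in linked services:

{
    "name": "CopyEncryptedFlatFile",
    "type": "Copy",
    "typeProperties": {
        "source": {
            "type": "BlobSource",
            "decryption": { // hypothetical block
                "method": "ClientSideEncryption",
                "keyWrappingKey": {
                    "type": "AzureKeyVaultSecret",
                    "store": { "referenceName": "MyKeyVaultLS", "type": "LinkedServiceReference" },
                    "secretName": "BlobEncryptionKey"
                }
            }
        },
        "sink": { "type": "BlobSink" }
    }
}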
-
Ability to use a SharePoint Online document library as a source in order to copy the files into Azure Blob storage.
50 votes -
Support pulling storage account key from Azure Key Vault (not from a secret)
When you set up Key Vault to periodically rotate the storage account key, it stores the key not in a secret but under a URI similar to https://<keyvault>.vault.azure.net/storage/<storageaccountname>
The setup instructions for this automatic key rotation are here:
https://docs.microsoft.com/en-us/azure/key-vault/key-vault-ovw-storage-keys#manage-storage-account-keys
Please enhance Azure Data Factory so that you can pull the storage account key for use in a linked service from this place in Azure Key Vault. Currently ADF only supports pulling from secrets, not from storage keys in Key Vault.
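For reference, ADF today can resolve a regular Key Vault secret in a linked service definition; the request is for an analogous reference type that resolves the managed storage key instead, sketched here as the hypothetical AzureKeyVaultStorageKey:

{
    "name": "AzureStorageLS",
    "properties": {
        "type": "AzureBlobStorage",
        "typeProperties": {
            "connectionString": {
                "type": "AzureKeyVaultStorageKey", // hypothetical; today only "AzureKeyVaultSecret" is supported here
                "store": { "referenceName": "MyKeyVaultLS", "type": "LinkedServiceReference" },
                "storageAccountName": "mystorageaccount" // hypothetical field naming the managed key
            }
        }
    }
}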
50 votes -
Rename objects in the portal
Provide the ability to rename all objects and update their associated scripts. Right now, deleting a dataset removes its slice history, which can be very problematic.
The ability to update a dataset's name and availability without having to recreate it would be very useful.
50 votes -
Add a feature to copy Always Encrypted column data to an Always Encrypted column of another Azure SQL database
44 votes -
Make "retryIntervalInSeconds" parameter able to accept dynamic values
Currently the "retryIntervalInSeconds" parameter only accepts integer literals, not pipeline variables or expressions that evaluate to integers.
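A sketch of the requested behavior on an activity's retry policy. The expression form is what is being asked for, not something ADF accepts today, and the variable name is illustrative:

{
    "name": "CopyWithRetry",
    "type": "Copy",
    "policy": {
        "retry": 3,
        "retryIntervalInSeconds": "@int(variables('retryIntervalSeconds'))" // requested: expression support; today only a literal such as 60 is accepted
    },
    "typeProperties": {
        "source": { "type": "BlobSource" },
        "sink": { "type": "BlobSink" }
    }
}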
43 votes -
Allow pipeline schedule to skip if it is already running (ADF V2)
Please add a feature to skip the scheduled run if the previous scheduled run is still in progress.
For example, I have a pipeline scheduled for every minute; if the pipeline is still running, the next scheduled run starts anyway, which causes overlapping pipeline executions.
Right now I'm updating some records in a SQL table, which takes time; before it finishes, the next scheduled run starts and updates the same records again, because the previous execution has not completed.
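For what it's worth, tumbling window triggers already expose a maxConcurrency setting, roughly as below (the trigger and pipeline names are placeholders). With a value of 1 it serializes windows rather than skipping them, so it only partially addresses this request, and schedule triggers have no equivalent:

{
    "name": "EveryMinuteTumblingTrigger",
    "properties": {
        "type": "TumblingWindowTrigger",
        "typeProperties": {
            "frequency": "Minute",
            "interval": 1,
            "startTime": "2020-01-01T00:00:00Z",
            "maxConcurrency": 1
        },
        "pipeline": {
            "pipelineReference": { "referenceName": "UpdateRecordsPipeline", "type": "PipelineReference" }
        }
    }
}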
41 votes -
Run containers through Data Factory custom activity
It is currently not possible to pull down Docker images and run them as tasks through a Data Factory custom activity, even though this is already possible through Batch itself.
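A sketch of what this might look like on a custom activity, borrowing the container settings shape from the Batch task API. The containerSettings block is hypothetical in ADF; the rest follows the existing custom activity definition, with placeholder names:

{
    "name": "RunContainerTask",
    "type": "Custom",
    "linkedServiceName": { "referenceName": "AzureBatchLS", "type": "LinkedServiceReference" },
    "typeProperties": {
        "command": "python /app/main.py",
        "containerSettings": { // hypothetical; mirrors Batch taskContainerSettings
            "imageName": "myregistry.azurecr.io/mytask:latest"
        }
    }
}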
40 votes -
Data Catalog integration
If Data Catalog (ADC) is to be our metadata store for helping users explore data sets, it occurs to me that there ought to be some sort of integration with ADF, so that new data sets appear automatically and their refresh status is available, letting end users know the data is up to date. I also notice there is no specific feedback category for ADC.
Also, ADF should be able to consume data sets in ADC by populating the appropriate linked services and tables.
38 votes
Thank you so much for your great suggestion! We will consider this as part of our roadmap planning.
-
Allow MSI authentication for AzureDataLakeStore in Mapping Data Flow
An ADLS (Gen1) linked service is authenticated with a Managed Identity (MSI) or a Service Principal. When authenticating with MSI, we can't use Mapping Data Flows. Will this functionality be added?
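For context, an ADLS Gen1 linked service authenticates with the factory's managed identity when no service principal credentials are supplied, roughly as below (the angle-bracket values are placeholders); the request is for Mapping Data Flows to accept this same definition:

{
    "name": "AzureDataLakeStoreLS",
    "properties": {
        "type": "AzureDataLakeStore",
        "typeProperties": {
            "dataLakeStoreUri": "https://<accountname>.azuredatalakestore.net/webhdfs/v1",
            "tenant": "<tenant id>",
            "subscriptionId": "<subscription id>",
            "resourceGroupName": "<resource group>"
        }
    }
}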
37 votes -
Azure Data Factory Dynamics 365 connector/dataset complex types
- How to nominate a Dynamics 365 alternative key for use with the “Upsert” sink, e.g. Account.Accountnumber. (A sink sketch follows this list.)
- Using sink to set “Lookup” types – when will this be available? (Ability to set CRM “EntityReference” types.) This is an URGENT requirement for ALL CRM integrations.
- Using sink to set “Owner” - when will this be available? This technically is the same as “Lookups”.
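A sketch of how alternate-key nomination might look on the Dynamics sink. The "writeBehavior": "upsert" and "ignoreNullValues" settings exist today; the alternateKeyName property and its value are the hypothetical part:

{
    "sink": {
        "type": "DynamicsSink",
        "writeBehavior": "upsert",
        "ignoreNullValues": false,
        "alternateKeyName": "accountnumber_key" // hypothetical: nominate Account.Accountnumber via its alternate key
    }
}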
37 votes -
Dark Theme for Azure Data Factory
Recently I've been using the Azure Data Factory UI. Pretty much all the other Azure tooling I use now supports a dark theme, so switching to Data Factory is a bit of a pain - the constant switch from dark to light is horrible. Also, being visually impaired, this would be a major accessibility bonus for me!
Please add the same theming as Azure Portal to the Data Factory UI.
37 votes -
Create a Databricks cluster once and reuse that single cluster across multiple Databricks activities
Hi,
I am looking for a feature in Data Factory for Databricks activities. Suppose there is a pipeline with multiple Databricks activities; as of now I can use a new job cluster to execute all of them, but spinning up and terminating a cluster for each activity takes a lot of time. I would like a capability where I can create a cluster at the beginning of the pipeline, have all activities use that existing cluster, and terminate it at the end.…
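A partial workaround today is to point the Databricks linked service at an existing interactive cluster via existingClusterId, roughly as below (names and the cluster ID are placeholders), so every activity in the pipeline reuses it. The request goes further, asking for an ephemeral cluster created at pipeline start and terminated at the end:

{
    "name": "AzureDatabricksLS",
    "properties": {
        "type": "AzureDatabricks",
        "typeProperties": {
            "domain": "https://westeurope.azuredatabricks.net",
            "accessToken": {
                "type": "AzureKeyVaultSecret",
                "store": { "referenceName": "MyKeyVaultLS", "type": "LinkedServiceReference" },
                "secretName": "DatabricksToken"
            },
            "existingClusterId": "0123-456789-hinge123"
        }
    }
}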
37 votes -
Refer to Azure Key Vault secrets in dynamic content
If I need to crawl a RESTful API that is protected with an API key, the only way to set that key is by injecting an additional header at the dataset level. This key is stored in clear text, which is poor security.
To make matters worse, if Git integration is enabled, that key is even committed into version control.
There should be a way to fetch values from Azure Key Vault elsewhere than just when setting up linked services. Alternatively, the REST linked service should support authentication with an API key.
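One workaround sometimes suggested is to fetch the secret at runtime with a Web activity that calls the Key Vault REST API using the factory's managed identity, roughly as below (vault and secret names are placeholders), and then reference @activity('GetApiKey').output.value in the header expression. The caveat is that the secret value can then surface in activity run output:

{
    "name": "GetApiKey",
    "type": "WebActivity",
    "typeProperties": {
        "url": "https://<vault-name>.vault.azure.net/secrets/<secret-name>?api-version=7.0",
        "method": "GET",
        "authentication": {
            "type": "MSI",
            "resource": "https://vault.azure.net"
        }
    }
}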
37 votes -
Add retry policy to webhook activity
Right now it is not possible to retry a Webhook activity. Sometimes these activities fail due to a 'bad request' or other issues that could easily be resolved by re-running them, but so far that re-run is manual.
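Other activities already honor a retry policy block; the request is for the WebHook activity to accept the same, roughly as sketched here (applying the policy to a WebHook is the hypothetical part, and the URL is a placeholder):

{
    "name": "CallWebhook",
    "type": "WebHook",
    "policy": { // requested: honored on WebHook as on other activities
        "retry": 3,
        "retryIntervalInSeconds": 30
    },
    "typeProperties": {
        "url": "https://example.com/hook",
        "method": "POST",
        "timeout": "00:10:00"
    }
}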
36 votes -
Implement functionality to rename Linked services (and possibly other components)
Currently, the only way to rename linked services and other components is to delete and recreate them. Doing this then requires each associated dataset to be updated manually.
Functionality to rename these within the GUI tool would add value, allowing components to be renamed with confidence that nothing will break. Whilst it is possible to edit the JSON by hand, when I tried this and uploaded it back into the Git repository, it broke the connections. The behind-the-scenes magic seems unable to handle it.
36 votes -
Support SQL Database Always Encrypted sources or destinations
With the recent increase in privacy and security concerns, namely GDPR, the need for using Always Encrypted on SQL Server or Azure SQL Database is also increasing. The problem is that the moment we enable these security features in SQL, we can no longer use ADF as the data flow orchestration. Without this feature, more secure enterprise scenarios are being left out.
36 votes -
Reading XML and XLS files directly using ADF components (ADF V2)
36 votes