Data Factory

Azure Data Factory allows you to manage the production of trusted information by offering an easy way to create, orchestrate, and monitor data pipelines over the Hadoop ecosystem using structured, semi-structured, and unstructured data sources. You can connect to your on-premises SQL Server, Azure database, tables or blobs and create data pipelines that will process the data with Hive and Pig scripting, or custom C# processing. The service offers a holistic monitoring and management experience over these pipelines, including a view of their data production and data lineage down to the source systems. The outcome of Data Factory is the transformation of raw data assets into trusted information that can be shared broadly with BI and analytics tools.

Do you have an idea, suggestion or feedback based on your experience with Azure Data Factory? We’d love to hear your thoughts.

  1. Save git branch settings on server side/Factory wide

    The customer needs to use a specific branch for the Data Factory resource, but currently the branch setting is saved to a cookie as <user>_preference, so we have to answer the "Branch selection" dialog every time the cache is cleared or the factory is accessed from a different machine or user.
    Please add the ability to save this as a factory-wide setting to avoid user error.

    112 votes
    0 comments
  2. Parameterize Linked Service for dropdowns like Integration Runtime or Authentication Method (e.g. for databases)

    In Data Factory we designed very general pipelines to copy data from a specific source technology to ADLS.
    We created one Pipeline (including dataset and Linked Service) per technology (SQL Server, Oracle, ...).

    Now I have the case that my sources are only reachable through different Integration Runtimes or different authentication methods (SQL login / Windows Authentication). As of now there is no way to parameterize the Integration Runtime or the authentication method so that the same Linked Service can be reused for all these cases. So I unfortunately need to duplicate my code just because this can't be parameterized.

    Can you please…

    81 votes
    0 comments
  3. Data factory should be able to use VNet without resorting to self hosted

    Self-hosted makes a lot of sense when integrating on-premises data; however, it's a shame to need to maintain a self-hosted integration runtime VM when wishing to leverage the extra security of a VNet, e.g. firewalled storage accounts.

    Ideally the Azure-managed integration runtimes would be able to join a VNet on demand.

    351 votes
    planned  ·  6 comments
  4. Allow MSI authentication for Azure SQL Database in Mapping Data Flow

    An Azure SQL Database Linked Service is authenticated with a Managed Identity (MSI) or a Service Principal. When authenticating with MSI, we can't use Mapping Data Flows. We get an error "AzureSqlDatabase does not support MSI authentication in Data Flow." Will this functionality be added?

    66 votes
    1 comment
  5. Improve performance of Copy Data Activity when dealing with a large number of small files

    The copy performance of the ADF Copy Data Activity going from a file system source to a Blob FileSystem or Blob sink is quite slow and CPU intensive relative to other copy mechanisms available when copying a large number (tens of thousands to millions) of small files (<1MB).

    Both AzCopy & Azure Storage Explorer are able to complete the copy operations from the same source to the same sink approximately 3-5x faster while using less CPU than the ADF Copy Activity.

    At a minimum, we would like to see performance parity with AzCopy / Azure Storage Explorer.

    53 votes
    1 comment
  6. Bitbucket Integration

    We need to use Bitbucket for a project. We are mirroring our Azure DevOps repo with the pipelines to Bitbucket. It would be easier if there were integration with Bitbucket.

    355 votes
    9 comments
  7. Add the ability to restart an activity from within a pipeline within a master pipeline in ADFv2

    If a pipeline structure is a master pipeline containing child pipelines with the activities held within these, it is not possible to restart the child pipeline and have the parent recognise when the child pipeline completes. Add the functionality to allow an activity in the child pipeline to be restarted that is then passed back to the parent pipeline when successfully completed.

    84 votes
    4 comments
  8. Persist global temporary tables between activities

    It is currently not possible to access a global temporary table created by one activity from a subsequent activity.

    If this were possible, you could create a pipeline with a Copy activity chained with a Stored Procedure activity, with both accessing the same global temporary table. The benefit of this is that operations against database scoped temporary tables aren't logged, so you can load millions of records in seconds.

    60 votes
    1 comment
  9. Azure Data Factory - Google Analytics Connector

    Some customers need to extract information from Google Analytics in order to build a data lake or SQL DW and gather marketing insights by combining it with other kinds of data.

    Today we rely on paid custom SSIS packages or on developing custom code.

    Or, if this is not possible in Azure Data Factory, there could be another way to extract this data with a native connector in Azure … maybe Logic Apps

    402 votes
    12 comments
  10. Dark Theme for Azure Data Factory

    Recently I've been using the Azure Data Factory UI. Now that pretty much all the Azure tooling I use supports dark themes, switching to Data Factory is a bit of a pain - the constant switch from dark to light is horrible. Also, as I am visually impaired, this would be a major accessibility bonus!

    Please add the same theming as Azure Portal to the Data Factory UI.

    51 votes
    1 comment
  11. Web Activity and Rest Connector OAuth support

    The usefulness of the Web Activity and the REST Connector is hamstrung without OAuth support for authentication. Many third-party services require OAuth before they can be consumed.

    35 votes
    0 comments
  12. Provide Query parameter support in Data Sources

    When using a query in a dataset, ADO-style @ParamName and ODBC-style ? parameters should be supported, for example SELECT * FROM dbo.Customers WHERE Mod_Date = ?. Parameter support should be similar to what the stored procedure option offers, with the ability to dynamically detect the parameters based on the query. While you can use dynamic SQL created within ADF, this approach is cumbersome and leaves you open to SQL injection attacks.

    7 votes
    0 comments
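
The difference between a bound ? parameter and dynamic SQL, as raised in the request above, can be sketched outside ADF. This is a minimal illustration using Python's built-in sqlite3 module, not Data Factory itself; the in-memory Customers table, its rows, and both function names are assumptions made up for the example.

```python
import sqlite3


def find_customers_bound(conn, mod_date):
    # The driver binds mod_date to the ? placeholder, so the value is
    # never spliced into the SQL text.
    cur = conn.execute(
        "SELECT name FROM Customers WHERE Mod_Date = ?", (mod_date,)
    )
    return [row[0] for row in cur.fetchall()]


def find_customers_dynamic(conn, mod_date):
    # Dynamic SQL, shown only for contrast: the value becomes part of
    # the statement, so a crafted value can change the query's meaning.
    sql = "SELECT name FROM Customers WHERE Mod_Date = '" + mod_date + "'"
    return [row[0] for row in conn.execute(sql).fetchall()]


# Hypothetical sample data (not from the original post).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (name TEXT, Mod_Date TEXT)")
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?)",
    [("Ada", "2020-01-01"), ("Bob", "2020-01-02")],
)

crafted = "x' OR '1'='1"
bound_result = find_customers_bound(conn, crafted)
dynamic_result = sorted(find_customers_dynamic(conn, crafted))
```

With the crafted value, the bound query treats the whole string as a date to compare against, while the dynamic query has its WHERE clause rewritten by the input.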
  13. Better error logging and debug options for Dataflow

    There is no precise error logging for Dataflow in a few scenarios. I am using a simple single source file that is split into multiple files based on the number of rows in the file. Sometimes I get an error like this:

    'Error: Dataflow execution failed due to user configuration error'

    The same code seems to run fine in other environments, so it is almost impossible to trace the error back and fix it. Any help would be appreciated.

    12 votes
    0 comments
  14. REST activity (linked service) should support Bearer token from key vault

    Currently the REST linked service only offers 3 options for "Authentication Type" (Basic, AAD Service Principal, and Managed Identity). This should be expanded with a "Bearer" token HTTP header option.
    It should also work in combination with getting the Bearer token from a Key Vault secret.

    15 votes
    1 comment
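
For readers unfamiliar with the header being requested, here is a rough sketch of a Bearer-authenticated request built with Python's standard library. It does not use any ADF or Key Vault API: get_secret is a hypothetical stand-in for a secret lookup, and the URL, secret name, and token value are placeholders.

```python
import urllib.request
from typing import Callable


def build_bearer_request(
    url: str, get_secret: Callable[[str], str], secret_name: str
) -> urllib.request.Request:
    # Look the token up at call time so it is never hard-coded
    # alongside the request definition.
    token = get_secret(secret_name)
    # The Bearer scheme sends the token in the Authorization header.
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}
    )


# Hypothetical in-memory "vault" standing in for a real secret store.
fake_vault = {"api-token": "example-token-value"}

request = build_bearer_request(
    "https://api.example.com/items", fake_vault.__getitem__, "api-token"
)
auth_header = request.get_header("Authorization")
```

The request object is only constructed here, not sent, so the sketch runs without network access.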
  15. Refer to Azure Key Vault secrets in dynamic content

    If I need to crawl a RESTful API which is protected with an API key, the only way to set that key is by injecting an additional header at the dataset level. This key is stored in clear text, which is poor security.

    To make matters worse, if git integration is enabled, that key is even committed into version control.

    There should be a way to fetch values from Azure Key Vault elsewhere than just for setting up linked services. Alternatively, the REST linked service should support authentication with an API key.

    46 votes
    0 comments
  16. Add ability to customize output fields from Execute Pipeline Activity

    This request comes directly from a StackOverflow post, https://stackoverflow.com/questions/57749509/how-to-get-custom-output-from-an-executed-pipeline .
    Currently, the output from the Execute Pipeline activity is limited to the pipeline's name and the runId of the executed pipeline, making it difficult to pass any data or settings from the executed pipeline back to the parent pipeline - for instance, if a variable is set in the child pipeline, there is no built-in way to pass this variable in the Azure Data Factory UI. There are a couple of workarounds, as detailed in the above StackOverflow post, but adding this as a built-in feature would greatly enhance the ability…

    22 votes
    0 comments
  17. Allow choosing logical AND or logical OR in activity dependencies

    We have activity dependencies today, but they are always logical AND. If we have Activity1 -> Activity2 -> Activity3 and we want to say that if any of these activities fails, run Activity4, it isn't straightforward. In SSIS, we can choose an expression and choose whether we need one or all conditions to be true when there are multiple constraints. We need similar functionality here. It can be achieved with a bit of creativity (repeat the failure activity as the single failure path after each of the original activities or use the If Condition to write logic that would…

    287 votes
    0 comments
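
The AND versus OR behaviour described in the request above can be modelled in a few lines. This is only a toy model of dependency evaluation, not ADF's engine or API; the should_run function, the mode argument, and the outcome strings are assumptions chosen for illustration.

```python
from typing import Dict, List, Tuple

# (upstream activity name, outcome that the dependency requires)
Dependency = Tuple[str, str]


def should_run(
    dependencies: List[Dependency], outcomes: Dict[str, str], mode: str = "and"
) -> bool:
    # One boolean per dependency: did the upstream activity end in the
    # required outcome?
    checks = [outcomes.get(name) == required for name, required in dependencies]
    # "and" requires every dependency to hold; "or" requires at least one.
    return all(checks) if mode == "and" else any(checks)


# Activity4 should run if any of the three upstream activities failed.
activity4_deps = [
    ("Activity1", "Failed"),
    ("Activity2", "Failed"),
    ("Activity3", "Failed"),
]

# Hypothetical run: Activity2 fails, so Activity3 never executes.
run_outcomes = {
    "Activity1": "Succeeded",
    "Activity2": "Failed",
    "Activity3": "Skipped",
}

runs_with_and = should_run(activity4_deps, run_outcomes, mode="and")
runs_with_or = should_run(activity4_deps, run_outcomes, mode="or")
```

Under AND semantics the failure handler is not triggered in this run, because not every upstream activity failed; under OR semantics a single failure is enough.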
  18. Support pulling storage account key from Azure Key Vault (not from a secret)

    When you set up Key Vault to periodically rotate the storage account key, it stores the key not in a secret but under a URI similar to https://<keyvault>.vault.azure.net/storage/<storageaccountname>

    The setup instructions for this automatic key rotation are here:
    https://docs.microsoft.com/en-us/azure/key-vault/key-vault-ovw-storage-keys#manage-storage-account-keys

    Please enhance Azure Data Factory so that you can pull the storage account key for use in a linked service from this place in Azure Key Vault. Currently ADF only supports pulling from secrets, not from storage keys in key vault.

    47 votes
    0 comments
  19. 121 votes
    3 comments
  20. Allow MSI authentication for AzureDataLakeStore in Mapping Data Flow

    An ADLS (gen 1) Linked Service is authenticated with a Managed Identity (MSI) or a Service Principal. When authenticating with MSI, we can't use Mapping Data Flows. Will this functionality be added?

    44 votes
    0 comments