Data Factory

Azure Data Factory allows you to manage the production of trusted information by offering an easy way to create, orchestrate, and monitor data pipelines over the Hadoop ecosystem using structured, semi-structured, and unstructured data sources. You can connect to your on-premises SQL Server, Azure databases, tables, or blobs and create data pipelines that process the data with Hive and Pig scripting, or custom C# processing. The service offers a holistic monitoring and management experience over these pipelines, including a view of their data production and data lineage down to the source systems. The outcome of Data Factory is the transformation of raw data assets into trusted information that can be shared broadly with BI and analytics tools.

Do you have an idea, suggestion or feedback based on your experience with Azure Data Factory? We’d love to hear your thoughts.

  1. Support Azure App Service APIs

    Can Data Factory consume data from or push data to an Azure App Service API? It should support Swagger-described APIs.

    78 votes  ·  0 comments
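
    As a rough illustration of the request above, this is what consuming and pushing data to an App Service API could look like from a custom step today; a minimal Python sketch in which the endpoint and payloads are hypothetical placeholders:

        import requests

        # Hypothetical App Service endpoint and payloads (placeholders only).
        API_BASE = "https://my-app-service.azurewebsites.net/api"

        # Consume data exposed by the App Service API.
        rows = requests.get(API_BASE + "/orders", timeout=30).json()

        # Push a derived result back to the API.
        resp = requests.post(API_BASE + "/order-summary", json={"count": len(rows)}, timeout=30)
        resp.raise_for_status()
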
  2. Add PowerShell cmdlet support for importing/exporting ARM templates

    Currently there is no way to duplicate the import and export ARM template functionality of the adf.azure.com GUI. This is a major gap in the DevOps / CI/CD story. Please add PowerShell cmdlets that allow import and export of just the ADF factory assets. Optionally, provide the ability to include the creation of the data factory in the exported template.

    74 votes  ·  3 comments
  3. Support PATCH method in Web Activity

    Some Azure REST APIs and other third-party APIs use the PATCH method.

    Please add support for this method or make the method parameter a string so that we can use any method.

    72 votes  ·  2 comments
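
    To illustrate the kind of call the Web Activity cannot currently issue, here is a minimal Python sketch of a PATCH request; the endpoint, token, and body are placeholders:

        import requests

        # Hypothetical endpoint; only the fields being changed go in the PATCH body.
        url = "https://example.com/api/items/42"
        headers = {"Authorization": "Bearer <token>"}   # placeholder credential

        resp = requests.patch(url, json={"status": "processed"}, headers=headers, timeout=30)
        resp.raise_for_status()
        print(resp.status_code)
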
  4. Elasticsearch

    Support Elasticsearch as both a source and a sink.

    68 votes  ·  1 comment
  5. Web Activity should support JSON array response

    When a Web Activity calls an API that returns a JSON array as the response, we get an error that says "Response Content is not a valid JObject". Please support JSON arrays at the top level of the response.

    67 votes  ·  2 comments
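
    As a possible workaround sketch (an assumption, not an official pattern), a small proxy can wrap the top-level JSON array in an object so the caller receives a JSON object; a minimal Python illustration:

        import json
        import requests

        # The upstream API returns a JSON array at the top level, e.g. [{"id": 1}, {"id": 2}].
        items = requests.get("https://example.com/api/list", timeout=30).json()

        # Wrap the array in an object so a consumer that requires a JSON object can parse it.
        wrapped = {"value": items, "count": len(items)}
        print(json.dumps(wrapped))
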
  6. Setting Expiration in Azure Data Lake Store

    Ability to set the expiration of files in Azure Data Lake Store when the file is created during a copy pipeline.

    66 votes  ·  7 comments
  7. Support extracting contents from TAR file

    My source gives me a file that is compressed and packaged as .tar.gz. Currently Azure Data Factory can only handle the decompression step, not unpacking the tar file. I think I will have to write a custom activity to handle this for now.

    66 votes  ·  1 comment
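
    The custom activity mentioned above can be very small; a minimal Python sketch of the unpacking step using the standard-library tarfile module, with placeholder paths:

        import tarfile

        # Decompress and unpack the .tar.gz delivered by the source system (paths are placeholders).
        with tarfile.open("input/source_export.tar.gz", mode="r:gz") as archive:
            archive.extractall(path="output/unpacked")  # extract every member of the archive
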
  8. Capture Databricks Notebook Return Value

    In Data Factory it is not possible to capture the return value from a Databricks notebook and send it as a parameter to the next activity.

    This forces you to store parameters somewhere else and look them up in the next activity.

    64 votes  ·  4 comments
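
    For context, the value to hand back is typically produced on the notebook side with dbutils.notebook.exit; a minimal Python sketch (dbutils only exists inside a Databricks notebook, and how the pipeline consumes this value is exactly the gap described above):

        import json

        # Inside the Databricks notebook: compute a result...
        result = {"rows_processed": 1250, "status": "ok"}

        # ...and hand it back to the caller as the notebook's exit value (a string).
        # dbutils is only defined inside the Databricks runtime.
        dbutils.notebook.exit(json.dumps(result))
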
  9. Support for Elastic database transactions

    ADF must support Elastic database transactions towards Azure SQL Database.

    This is equivalent to the on-premises scenario, where SSIS transactions use MSDTC against SQL Server.

    Currently, if you set TransactionOption=Required on a data flow and use an OLE DB connection to an Azure SQL Database, you receive an error like:
    "The SSIS runtime has failed to enlist the OLE DB connection in a distributed transaction with error 0x80070057 "The parameter is incorrect"."

    63 votes  ·  3 comments
  10. Get Metadata for Multiple Files Matching Wildcard

    The Get Metadata activity is not useful when there is a wildcard in the dataset file path. Could the Get Metadata activity be expanded to return the number of files found in the dataset and an array of metadata values? We want to use this information to decide whether to continue with the remainder of the pipeline based on whether any files satisfy the wildcard.

    61 votes  ·  1 comment
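
    Until Get Metadata supports wildcards, the equivalent check can be scripted; a rough Python sketch against Blob storage, assuming the azure-storage-blob v12 package, with placeholder connection string, container, and pattern:

        import fnmatch
        from azure.storage.blob import ContainerClient

        # Placeholders: connection string, container, folder prefix, and wildcard pattern.
        container = ContainerClient.from_connection_string("<connection-string>", "landing")
        pattern = "daily/sales_*.csv"

        # List blobs under the prefix, then filter on the wildcard.
        matches = [b for b in container.list_blobs(name_starts_with="daily/")
                   if fnmatch.fnmatch(b.name, pattern)]

        print(len(matches))                          # number of files found
        print([(b.name, b.size) for b in matches])   # per-file metadata
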
  11. Allow parameters for pipeline reference name in ExecutePipeline activity

    The Execute Pipeline activity accepts only a constant pipeline name. It should allow a pipeline to be invoked dynamically by accepting a parameter for the pipeline reference name.

    60 votes  ·  2 comments
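
    A common workaround today is to call the createRun REST endpoint with the pipeline name supplied at runtime (for example from a Web Activity or a script); a minimal Python sketch with placeholder IDs and token:

        import requests

        sub, rg, factory = "<subscription-id>", "<resource-group>", "<factory-name>"
        pipeline_name = "CopyCustomers"   # resolved at runtime instead of being hard-coded

        url = ("https://management.azure.com/subscriptions/{}/resourceGroups/{}"
               "/providers/Microsoft.DataFactory/factories/{}"
               "/pipelines/{}/createRun?api-version=2018-06-01").format(sub, rg, factory, pipeline_name)

        # Token acquisition omitted; any Azure AD credential flow works here.
        resp = requests.post(url, headers={"Authorization": "Bearer <token>"}, json={"sourceTable": "Customers"})
        resp.raise_for_status()
        print(resp.json()["runId"])       # run ID of the pipeline that was just started
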
  12. Pipeline-level alerts are required in Azure Data Factory: a single alert email should be sent per pipeline run

    Pipeline-level alerts are required in Azure Data Factory. A pipeline may have many activities, and since only activity-level alerts are available now, the mailbox fills up with alert emails. A single alert email should be sent once the pipeline has executed.

    60 votes  ·  1 comment
  13. Data Factory pipelines should support webhook execution

    It would be great if Azure Data Factory jobs could be executed by webhooks in addition to their default schedule.
    The current options of re-running via PowerShell or the Azure portal are not graceful for production environments or for automation.
    Ideally, the job could be run by an HTTP POST to a webhook; this would resolve many automation challenges.
    Potentially this could be integrated with Azure Logic Apps.

    59 votes  ·  1 comment
  14. Allow Stored Procedure Output Parameters

    In Logic Apps, it is possible to execute a stored procedure that contains an OUTPUT parameter, read the value of that output parameter, and pass it to the next action in the Logic App. This is not possible with ADF v2.

    56 votes  ·  0 comments
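
    For comparison, this is how an OUTPUT parameter can be read from client code; a minimal Python/pyodbc sketch in which the procedure name and connection string are hypothetical, and the batch SELECTs the output value back so it can be fetched:

        import pyodbc

        # Hypothetical procedure dbo.usp_CreateBatch with an OUTPUT parameter @BatchId.
        conn = pyodbc.connect("<odbc-connection-string>")
        cursor = conn.cursor()

        # Declare a variable, pass it as OUTPUT, then SELECT it so the client can fetch the value.
        cursor.execute("""
            SET NOCOUNT ON;
            DECLARE @new_id INT;
            EXEC dbo.usp_CreateBatch @BatchName = ?, @BatchId = @new_id OUTPUT;
            SELECT @new_id AS BatchId;
        """, "nightly-load")

        batch_id = cursor.fetchone().BatchId
        conn.commit()
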
  15. Ability to name activities in Data Factory dynamically

    When copying data from e.g. Azure SQL DB to Azure DWH, you may want to use the ForEach iterator, similar to the pattern described in the tutorial at https://docs.microsoft.com/en-us/azure/data-factory/tutorial-bulk-copy.
    The downside of this approach is that the logging in the monitor window is somewhat useless, because you cannot see which activity has failed: they are all named the same (but with different RunIds, of course).

    It would be much better if activities could be named at runtime, e.g. CopyTableA, CopyTableB, CopyTableC instead of CopyTable, CopyTable, CopyTable.

    55 votes  ·  9 comments
  16. Amazon S3 sink

    We really need the ability to write to S3 rather than just read.

    Many larger clients (big groups with multiple IT departments) often have both Azure and Amazon, and ADF is getting disqualified from benchmarks against Talend Online and Matillion because it won't push to other cloud services...

    Please help ^^

    53 votes  ·  3 comments
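
    Today the write-to-S3 step has to be scripted outside the copy activity; a minimal Python sketch using boto3, with placeholder bucket and paths:

        import boto3

        # Upload a file produced by the pipeline to an S3 bucket (bucket and paths are placeholders).
        s3 = boto3.client("s3")
        s3.upload_file("output/daily_extract.parquet", "my-landing-bucket", "adf/daily_extract.parquet")
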
  17. Talk to the O365 OneRM team and get a copy of their O365DataTransfer program -- it does a lot of things DF needs to do.

    O365 OneRM team is solving the same problems that you are, and has a mature platform that does many of the things that Azure DF will need to do. Talk to Zach, Naveen, and Karthik in building 2. Also talk to Pramod. It'll accelerate you in terms of battle-hardened user needs and things pipeline automation has needed to do in the field. You'll want to get a copy of the bits/code.

    53 votes  ·  0 comments
  18. Support encrypted flat files as the source in copy activities

    We use this approach to encrypt sensitive flat files at rest. Please add a feature to ADF to support reading from encrypted flat files in blob storage:
    https://azure.microsoft.com/en-us/documentation/articles/storage-encrypt-decrypt-blobs-key-vault/

    52 votes  ·  2 comments
  19. Ability to use a SharePoint Online document library as a source in order to copy files into Azure Blob storage

    Ability to use a SharePoint Online document library as a source in order to copy files into Azure Blob storage.

    50 votes  ·  1 comment
  20. Rename objects in the portal

    Provide the ability to rename all objects and update their associated scripts. Right now, deleting a dataset removes its slice history, which can be very problematic.

    The ability to update the dataset's name and availability without having to recreate it would be very useful.

    50 votes  ·  5 comments
