Data Factory

Azure Data Factory allows you to manage the production of trusted information by offering an easy way to create, orchestrate, and monitor data pipelines over the Hadoop ecosystem using structured, semi-structured, and unstructured data sources. You can connect to your on-premises SQL Server, Azure database, tables or blobs and create data pipelines that will process the data with Hive and Pig scripting, or custom C# processing. The service offers a holistic monitoring and management experience over these pipelines, including a view of their data production and data lineage down to the source systems. The outcome of Data Factory is the transformation of raw data assets into trusted information that can be shared broadly with BI and analytics tools.

Do you have an idea, suggestion or feedback based on your experience with Azure Data Factory? We’d love to hear your thoughts.

  1. Execute Pipeline activity automatic rerun

    Add the possibility to automatically rerun the related pipeline when a failure occurs.

    This would help in cases where rerunning a single activity will not get the pipeline back on track, for example, when data must be submitted again from the beginning. In these cases, it might be necessary to rerun the complete pipeline.

    As of today, the Execute Pipeline activity has no option to specify the number of retries to attempt before the activity is set to failed.

    The current workaround involves several components and seems unnecessarily complex.

    The attached picture describes a linear pipeline including…
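
    The requested behavior can be sketched in a few lines. This is only an illustration of the semantics being asked for, not how Data Factory implements anything: `run_with_retries`, `run_pipeline`, and `max_retries` are hypothetical names, and `run_pipeline` stands in for whatever starts the child pipeline and reports success or failure.

```python
def run_with_retries(run_pipeline, max_retries):
    """Rerun a whole pipeline until it succeeds or the retries are used up.

    run_pipeline is a hypothetical callable that executes the complete
    child pipeline from the beginning and returns True on success.
    Returns a (succeeded, attempts) tuple.
    """
    attempts = 0
    while True:
        attempts += 1
        if run_pipeline():
            return True, attempts
        # The first run is not a retry, so allow max_retries further runs.
        if attempts > max_retries:
            return False, attempts


if __name__ == "__main__":
    # Simulated pipeline that fails twice and then succeeds.
    state = {"runs": 0}

    def flaky_pipeline():
        state["runs"] += 1
        return state["runs"] >= 3

    print(run_with_retries(flaky_pipeline, max_retries=3))
```

    A retry count on the Execute Pipeline activity would give this behavior with a single setting instead of the extra components the workaround needs.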

    82 votes
    0 comments
  2. Clear errors and "unused" data slices

    There should be an option to clear old errors.
    When no pipeline produces or consumes a data slice and that slice has errors, the counter still shows them as "current" errors, which they are not. I would like to remove these unused slices and their errors.

    82 votes
    0 comments
  3. Support Azure app service API

    Can Data Factory consume data from, or push data to, an Azure App Service API? This should include support for Swagger APIs.

    79 votes
    0 comments
  4. Publish Azure Data Factory ARM templates to a custom folder in the publish branch

    Provide the ability to publish Azure Data Factory ARM templates to a custom folder in the publish branch. An additional property could be added to the publish_config.json file to cater for this, for example:

    {
        "publishBranch": "release/adf_publish",
        "publishFolder": "Deployment/ARM/"
    }

    https://docs.microsoft.com/en-us/azure/data-factory/source-control#configure-publishing-settings

    77 votes
    3 comments
  5. Run containers through Data Factory custom activity

    It is currently not possible to pull down Docker images and run them as tasks through Data Factory, even though this is already possible through Batch itself.

    https://github.com/MicrosoftDocs/azure-docs/issues/16473

    75 votes
    2 comments
  6. Google Sheets connector

    Hello,

    It would be great and very useful in my opinion if there was a Google Sheets connector.

    Thanks in advance.

    74 votes
    7 comments
  7. Support PATCH method in Web Activity

    Some Azure REST APIs and other third-party APIs use the PATCH method.

    Please add support for this method or make the method parameter a string so that we can use any method.

    73 votes
    3 comments
  8. GitLab Integration in Azure Data Factory

    It would be useful to have GitLab integration in Azure Data Factory, alongside GitHub and Azure Repos, as it is one of the most popular tools.

    71 votes
    1 comment
  9. Support for Elastic database transactions

    ADF must support Elastic database transactions towards Azure SQL Database.

    This is equivalent to the on-premises scenario, where SSIS transactions use MSDTC towards SQL Server.

    Currently, if you set TransactionOption=Required on a data flow and use an OLE DB connection to an Azure SQL Database, you receive an error like:
    "The SSIS runtime has failed to enlist the OLE DB connection in a distributed transaction with error 0x80070057 "The parameter is incorrect"."

    70 votes
    3 comments
  10. Web Activity should support JSON array response

    When a Web Activity calls an API that returns a JSON array as the response, we get an error that says "Response Content is not a valid JObject". Please support JSON arrays as the top level of the response.
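
    Until top-level arrays are supported, one possible stopgap is to put a thin intermediary in front of the API that wraps the array in an object. The sketch below assumes you control such an intermediary; `wrap_top_level_array` and the `value` key are hypothetical names chosen for illustration, not part of any Data Factory API.

```python
import json


def wrap_top_level_array(body, key="value"):
    """Return JSON text whose top level is always an object.

    A top-level JSON array is wrapped as {key: [...]}; any other
    top-level value is passed through unchanged.
    """
    parsed = json.loads(body)
    if isinstance(parsed, list):
        parsed = {key: parsed}
    return json.dumps(parsed)


if __name__ == "__main__":
    # An array response becomes an object with a single property.
    print(wrap_top_level_array('[{"id": 1}, {"id": 2}]'))
```

    This adds another moving part to the pipeline, which is why native support for array responses would still be preferable.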

    68 votes
    2 comments
  11. Allow MSI authentication for AzureDataLakeStore in Mapping Data Flow

    An ADLS (gen 1) Linked Service is authenticated with a Managed Identity (MSI) or a Service Principal. When authenticating with MSI, we can't use Mapping Data Flows. Will this functionality be added?

    67 votes
    1 comment
  12. Improve performance of Copy Data Activity when dealing with a large number of small files

    The copy performance of the ADF Copy Data Activity going from a file system source to a Blob FileSystem or Blob sink is quite slow and CPU intensive relative to other available copy mechanisms when copying a large number (tens of thousands to millions) of small files (<1 MB).

    Both AzCopy & Azure Storage Explorer are able to complete the copy operations from the same source to the same sink approximately 3-5x faster while using less CPU than the ADF Copy Activity.

    At a minimum, we would like to see performance parity with AzCopy / Azure Storage Explorer.

    64 votes
    1 comment
  13. Add Support for Apache Kafka

    Add support for the Apache Kafka Producer, Consumer, Streams, and KSQL APIs in Azure Data Factory.

    62 votes
    1 comment
  14. Pipeline-level alerts: send a single alert email per pipeline execution

    Azure Data Factory needs pipeline-level alerts. A pipeline may have many activities, and because only activity-level alerts are available now, the mailbox fills up with alert emails. A single alert email should be sent once the pipeline has executed.

    61 votes
    1 comment
  15. Data Factory pipelines should support webhook execution

    It would be great if Azure Data Factory jobs could be run by webhooks, using their default schedule.
    The current limitation of rerunning via PowerShell or the Azure portal is not graceful for production environments and is hard to automate.
    Ideally, the job would run on an HTTP POST to the webhook, which would resolve many automation challenges.
    Potentially this could be integrated with Azure Logic Apps.

    60 votes
    2 comments
  16. 60 votes
    planned  ·  7 comments
  17. Configure for singleton Pipeline Run

    For wall-clock trigger schedules, there should be a property that controls whether a new pipeline run is allowed while a previous run is still in progress.

    59 votes
    3 comments
  18. Remove output limitations on Web and Azure Function activities

    Currently, if you make a call to a web API and the JObject returned is greater than 1 MB in size, the activity fails with the error:

    "The length of execution ouput is over limit (around 1M currently). "

    This is a big limitation and would be great if it were removed or increased.

    58 votes
    4 comments
  19. Support web linking with REST API pagination

    REST API pagination needs to support RFC 5988-style links in the header.

    Examples are ServiceNow and Greenhouse.

    See: https://tools.ietf.org/html/rfc5988#page-6 for RFC
    See: https://stackoverflow.com/questions/54589413/azure-data-factory-rest-api-to-service-now-pagination-issue for a related stack overflow question

    Greenhouse link header example:

    link →<https://harvest.greenhouse.io/v1/applications?page=2&perpage=100>; rel="next",<https://harvest.greenhouse.io/v1/applications?page=129&perpage=100>; rel="last"

    We need to grab the 'next' URL, which is not currently possible with the pagination support:
    https://docs.microsoft.com/en-us/azure/data-factory/connector-rest#pagination-support

    The only way around this seems to be to go outside Data Factory to fetch the data (e.g. Databricks with Python), which defeats the purpose.
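
    For reference, pulling the 'next' URL out of such a header takes only a few lines outside Data Factory. The sketch below is a simplified parser, not a full RFC 5988 implementation: `parse_link_header` is a hypothetical helper, and it assumes that no parameter value contains a comma.

```python
import re

# One link-value: <target URL> followed by its ;-separated parameters.
# Simplification: parameters are assumed not to contain commas.
_LINK_VALUE = re.compile(r'<([^>]*)>([^,]*)')
_REL_PARAM = re.compile(r';\s*rel\s*=\s*"?([^";,]+)"?', re.IGNORECASE)


def parse_link_header(header_value):
    """Return a dict mapping each rel value (e.g. 'next') to its URL."""
    links = {}
    for url, params in _LINK_VALUE.findall(header_value):
        match = _REL_PARAM.search(params)
        if match:
            # A single rel parameter may list several space-separated types.
            for rel in match.group(1).split():
                links[rel.lower()] = url
    return links


if __name__ == "__main__":
    header = (
        '<https://harvest.greenhouse.io/v1/applications?page=2&perpage=100>; rel="next",'
        '<https://harvest.greenhouse.io/v1/applications?page=129&perpage=100>; rel="last"'
    )
    # Follow links.get("next") until the header no longer contains one.
    print(parse_link_header(header).get("next"))
```

    A pagination rule that could read the `next` relation from the Link header would make this kind of external step unnecessary.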

    56 votes
    4 comments
  20. Web Activity and REST Connector OAuth support

    The usefulness of the Web Activity and the REST Connector is hamstrung without OAuth support for authentication. Many third-party services require it before they can be consumed.

    54 votes
    0 comments