Data Factory

Azure Data Factory allows you to manage the production of trusted information by offering an easy way to create, orchestrate, and monitor data pipelines over the Hadoop ecosystem using structured, semi-structured, and unstructured data sources. You can connect to your on-premises SQL Server, Azure databases, tables, or blobs and create data pipelines that process the data with Hive and Pig scripting, or with custom C# processing. The service offers a holistic monitoring and management experience over these pipelines, including a view of their data production and data lineage down to the source systems. The outcome of Data Factory is the transformation of raw data assets into trusted information that can be shared broadly with BI and analytics tools.

Do you have an idea, suggestion or feedback based on your experience with Azure Data Factory? We’d love to hear your thoughts.

  1. Show the next run time for triggers

    Add the next run time to the triggers view. Particularly when updating code, pushing new… given the calculation that determines the next occurrence when the start time is in the past, it would be nice if the trigger screen (or the trigger-runs screen) included the next run date/time. (A sketch of that calculation follows below.)

    1 vote  ·  0 comments
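
    A client-side sketch of that calculation (my own illustration, not ADF's actual implementation): given a start time in the past and a fixed recurrence interval, the next run is the first interval boundary after now.

        from datetime import datetime, timedelta
        import math

        def next_run(start: datetime, interval: timedelta, now: datetime) -> datetime:
            # Next occurrence of a fixed-interval schedule whose start time may be in the past.
            if now <= start:
                return start
            elapsed = (now - start) / interval      # fractional intervals already elapsed
            return start + interval * math.ceil(elapsed)

        print(next_run(datetime(2019, 1, 1), timedelta(hours=6), datetime.utcnow()))
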
  2. ADF: GCS authentication using JSON key pair

    In the current version, ADF allows only HMAC key authentication for Google Cloud Storage, which is not very intuitive, especially in multi-factory setups and lengthy dev/test cycles. ADF should also support the service-account authentication flow using a JSON key pair, which would also allow neater integration with Key Vault. (For reference, the JSON-key flow is sketched below.)

    1 vote  ·  0 comments
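
    For reference, this is roughly what service-account JSON key authentication against GCS looks like from Python, using the google-auth and google-cloud-storage libraries (the key path and bucket name are placeholders; this is not something ADF supports today, which is the point of the request):

        from google.oauth2 import service_account
        from google.cloud import storage

        # Load a service-account JSON key pair; the file name is illustrative.
        credentials = service_account.Credentials.from_service_account_file("sa-key.json")
        client = storage.Client(project=credentials.project_id, credentials=credentials)

        for blob in client.list_blobs("my-bucket"):   # bucket name is a placeholder
            print(blob.name)
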
  3. Resizing the Azure-SSIS integration runtime requires removing a delete lock from an unrelated VM

    A support ticket (119072324003920) has been opened for this issue.

    When trying to resize the Azure-SSIS integration runtime, VNet validation complains that it detects a delete lock on the resource group where the VNet resides. (A sketch for listing the offending locks follows below.)

    The delete lock belongs to a VM within the same resource group, but that VM has nothing to do with this IR; the IR belongs to a Data Factory in a totally different resource group.

    A user with permission to the IR doesn't have permission to remove the delete lock from the VM in the VNet's resource group, so this prevents users from…

    1 vote  ·  0 comments
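
    Until this is fixed, one way to see which lock trips the validation is to enumerate locks on the VNet's resource group with the management SDK (a hedged sketch assuming the track-2 azure-mgmt-resource package; subscription ID and resource-group name are placeholders):

        from azure.identity import DefaultAzureCredential
        from azure.mgmt.resource.locks import ManagementLockClient

        # Subscription ID and resource-group name are placeholders.
        client = ManagementLockClient(DefaultAzureCredential(), "<subscription-id>")
        for lock in client.management_locks.list_at_resource_group_level("vnet-rg"):
            print(lock.name, lock.level, lock.notes)
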
  4. Sample Python code to copy MySQL data to Azure Blob

    I followed https://docs.microsoft.com/en-us/azure/data-factory/quickstart-create-data-factory-python and created the pipeline to copy data from Azure blob to blob.
    Then I tried to extend this code to copy data from a MySQL table and store it in Azure Blob.
    I'm creating the linked service with: ls_mysql = AzureMySqlLinkedService(connection_string=mysql_string)
    and creating the dataset like this: ds_azuremysql = AzureMySqlTableDataset(ds_ls, table_name=table1)
    And I'm getting the following exception:
    "Errors: 'AzureMySqlTable' is not a supported data set type for Azure Storage linked service. Supported types are 'AzureBlob' and 'AzureTable'"
    Can you help to identify the correct method to use here? (A corrected sketch follows below.)
    Or any API…

    1 vote  ·  0 comments
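
    The exception occurs because the dataset is bound to the Azure Storage linked service (ds_ls in the quickstart) rather than to the MySQL linked service. A hedged sketch of the working pairing, using the same SDK models as the quickstart (linked-service and dataset names are placeholders):

        from azure.mgmt.datafactory.models import (
            AzureMySqlSource, AzureMySqlTableDataset, BlobSink, CopyActivity,
            DatasetReference, LinkedServiceReference)

        # Bind the MySQL dataset to the MySQL linked service, not the storage one.
        ls_mysql_ref = LinkedServiceReference(reference_name="lsAzureMySql")
        ds_mysql = AzureMySqlTableDataset(linked_service_name=ls_mysql_ref,
                                          table_name="table1")

        # In the copy activity the source is AzureMySqlSource; only the output
        # (blob) dataset keeps the Azure Storage linked service.
        copy = CopyActivity(
            name="CopyMySqlToBlob",
            inputs=[DatasetReference(reference_name="dsAzureMySql")],
            outputs=[DatasetReference(reference_name="dsOutBlob")],
            source=AzureMySqlSource(),
            sink=BlobSink())
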
  5. Azure integration runtime (IR) should not go offline when a large number of tables are copied in parallel

    A major blocking issue is that the IR goes "offline" if a large number of tables are copied in parallel. Instead of going offline, the IR should queue the requests. (A client-side throttle that helps in the meantime is sketched below.)

    1 vote  ·  0 comments
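
    As a client-side workaround until queuing exists, callers can throttle how many copy runs they start at once. A minimal sketch with a semaphore; the run_copy_pipeline helper is hypothetical and would wrap the SDK's create_run call for one table:

        import threading
        from concurrent.futures import ThreadPoolExecutor

        MAX_PARALLEL = 4                 # stay below what the IR tolerates; tune empirically
        gate = threading.Semaphore(MAX_PARALLEL)

        def run_copy_pipeline(table):
            # Hypothetical wrapper that starts the copy pipeline run for one table.
            print(f"copying {table}")

        def throttled_copy(table):
            with gate:                   # block here instead of overwhelming the IR
                run_copy_pipeline(table)

        tables = [f"table_{i}" for i in range(50)]
        with ThreadPoolExecutor(max_workers=16) as pool:
            list(pool.map(throttled_copy, tables))
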
  6. Debug error messages disappear before they can be read

    The error message that Data Factory pops up when a pipeline fails in debug goes away too quickly; it disappears faster than I can copy and paste it into a text editor. You cannot read it. I just wanted to call this out as bad UX so that you can fix it. (A workaround for retrieving the error afterwards is sketched below.)

    1 vote  ·  1 comment
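
    As a workaround, the full error text can still be pulled after the fact through the SDK, along the lines of the quickstart's monitoring code (a sketch; resource names are placeholders, and the credential type depends on your SDK version):

        from datetime import datetime, timedelta
        from azure.identity import DefaultAzureCredential
        from azure.mgmt.datafactory import DataFactoryManagementClient
        from azure.mgmt.datafactory.models import RunFilterParameters

        client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")
        filters = RunFilterParameters(
            last_updated_after=datetime.utcnow() - timedelta(hours=1),
            last_updated_before=datetime.utcnow())
        result = client.activity_runs.query_by_pipeline_run(
            "<resource-group>", "<factory-name>", "<pipeline-run-id>", filters)
        for run in result.value:
            print(run.activity_name, run.status, run.error)   # full error, copy at leisure
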
  7. Support writing to Event Grid

    Support writing to Event Grid using a standard activity instead of calling a webhook to post. This would help scenarios where application pipelines need to integrate based on events. (Publishing from code is sketched below for comparison.)

    1 vote  ·  0 comments
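
    For comparison, publishing to an Event Grid topic from Python looks roughly like this (a sketch against the azure-eventgrid v4 package; topic endpoint, key, and event type are placeholders):

        from azure.core.credentials import AzureKeyCredential
        from azure.eventgrid import EventGridPublisherClient, EventGridEvent

        # Topic endpoint and access key are placeholders.
        client = EventGridPublisherClient(
            "https://<topic-name>.<region>-1.eventgrid.azure.net/api/events",
            AzureKeyCredential("<topic-access-key>"))

        client.send(EventGridEvent(
            subject="adf/pipelines/nightly-load",
            event_type="Contoso.Pipeline.Completed",   # event type name is illustrative
            data={"pipeline": "nightly-load", "status": "Succeeded"},
            data_version="1.0"))
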
  8. Allow JSON to be imported directly into the ADF editor

    As a developer, I want to import existing JSON code for a complete pipeline into the ADF GUI editor. The editor would transform the JSON into visually editable objects with their associated properties, so that I could continue editing and publishing visually and leverage a library of reusable JSON code that I keep in Git.

    Bonus points if you integrate all Git functions into ADF, similar to Visual Studio.

    This would enable standard libraries of common tasks as templates that could be easily reused and shared. (The REST-API route available today is sketched below.)

    1 vote  ·  1 comment
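
    Outside the GUI, raw pipeline JSON can already be pushed with the REST API, which is one way to round-trip a Git library today. A sketch with requests; the URL shape follows the documented 2018-06-01 API, and all names plus the token are placeholders:

        import json
        import requests

        token = "<bearer-token>"              # acquire via azure.identity in practice
        url = ("https://management.azure.com/subscriptions/<subscription-id>"
               "/resourceGroups/<resource-group>/providers/Microsoft.DataFactory"
               "/factories/<factory-name>/pipelines/<pipeline-name>"
               "?api-version=2018-06-01")

        with open("pipeline.json") as f:      # pipeline JSON kept in Git; name illustrative
            body = json.load(f)

        resp = requests.put(url, json=body, headers={"Authorization": f"Bearer {token}"})
        resp.raise_for_status()
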
  9. Trigger for events in Azure File Share

    Currently there are no triggers for events happening in a file share, so we can't readily analyze files as they arrive in the share. There should be a feature to trigger functions on file-share events. (A polling stopgap is sketched below.)

    1 vote  ·  0 comments
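
    Until native triggers exist, polling the share from a function or a scheduled script is the usual stopgap. A sketch with the azure-storage-file-share package (connection string and share name are placeholders):

        from azure.storage.fileshare import ShareClient

        # Connection string and share name are placeholders.
        share = ShareClient.from_connection_string("<connection-string>",
                                                   share_name="incoming")
        for item in share.list_directories_and_files():
            if not item["is_directory"]:
                # Diff against a persisted seen-set, then start the pipeline.
                print("candidate file:", item["name"])
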
  10. Add the ability to publish selected changes, to avoid dependencies locking up the publish

    For example, if I change a pipeline to use a different data source and try to delete the old data source at the same time, publishing identifies the data source as still in use by the pre-published object, and I have to back out my changes and make them in steps.

    1 vote  ·  0 comments
  11. Changing a pipeline should not change its run history

    If you change a pipeline and then go to the history of earlier runs of that pipeline (runs that never executed the new code), the graphical side shows the current code with the note "Pipeline was modified after this run. The current pipeline configuration is shown." instead of the configuration that actually ran.

    1 vote  ·  0 comments
  12. Sort out the filters on the ADF v2 monitor pipeline page

    Please make it obvious which filters have been applied in the pipeline monitor window. It is not clear when a filter is set, and it is easy to miss.

    1 vote  ·  0 comments
  13. Raise the 1 MB limit on the Get Metadata activity

    Currently the Get Metadata activity in ADF has a maximum output size of 1 MB. Any chance this could be increased? We are loading files from an AWS folder that contains thousands of files, and retrieving the filenames exceeds this limit. This is very urgent; please help! (A client-side workaround is sketched below.)

    1 vote  ·  0 comments
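
    A common workaround for very large folders is to enumerate the names outside Get Metadata, e.g. from a custom or Azure Function activity. Listing an S3 folder page by page with boto3 looks like this (bucket and prefix are placeholders):

        import boto3

        s3 = boto3.client("s3")
        paginator = s3.get_paginator("list_objects_v2")

        names = []
        for page in paginator.paginate(Bucket="my-bucket", Prefix="incoming/"):
            names.extend(obj["Key"] for obj in page.get("Contents", []))
        print(len(names), "files")   # pages of up to 1000 keys; no 1 MB ceiling
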
  14. Clarify the ADF replace function documentation

    Does the ADF replace function replace the FIRST occurrence of the old_string, or ALL occurrences? The documentation isn't clear.

    1 vote  ·  0 comments
  15. Support more than 10 concurrent slices per dataset

    Support more than 10 concurrent slices per dataset, as historical loads take a long time.

    1 vote  ·  0 comments
  17. Add "Execute SQL Script" node to Azure Data Factory

    Add "Execute SQL Script" node to Azure Data Factory

    1 vote  ·  0 comments
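
    Until a dedicated activity exists, the usual workaround is to wrap the script in a stored procedure and call it from the pipeline. Via the SDK that is roughly (a sketch; linked-service and procedure names are placeholders):

        from azure.mgmt.datafactory.models import (
            LinkedServiceReference, SqlServerStoredProcedureActivity)

        # Wrap the ad-hoc SQL in a stored procedure, then call it from the pipeline.
        run_script = SqlServerStoredProcedureActivity(
            name="RunMyScript",
            linked_service_name=LinkedServiceReference(reference_name="lsAzureSqlDb"),
            stored_procedure_name="dbo.usp_run_my_script")
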
  17. How to import data into Azure PostgreSQL from a storage account

    Is there an alternative way to import data into Azure PostgreSQL? (One option is sketched below.)

    1 vote  ·  1 comment
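
    One alternative outside ADF is to stream the blob straight into PostgreSQL's COPY command. A sketch with azure-storage-blob and psycopg2 (connection details and names are placeholders, and the blob is assumed to be CSV):

        import io
        import psycopg2
        from azure.storage.blob import BlobClient

        # Storage connection string, container/blob names, and server details are placeholders.
        blob = BlobClient.from_connection_string(
            "<storage-connection-string>", container_name="exports", blob_name="data.csv")
        data = io.BytesIO(blob.download_blob().readall())

        conn = psycopg2.connect("host=<server>.postgres.database.azure.com dbname=mydb "
                                "user=<user> password=<password> sslmode=require")
        with conn, conn.cursor() as cur:
            cur.copy_expert("COPY target_table FROM STDIN WITH (FORMAT csv)", data)
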
  18. Save filters in pipeline run monitoring

    I have a default set of filters that I use to monitor my pipelines. Occasionally, I need to view my pipelines with an alternate set of filters. I would like not to have to recreate my filter set every time I want to return to my default view.

    1 vote  ·  0 comments
  19. Event triggers should support Add Dynamic Content (i.e. parameterization)

    An event trigger fires an Azure Data Factory pipeline when a file is placed at some location. This works as expected and triggers the pipeline.
    There is a requirement to control the file name, container, etc. that are used to trigger the pipeline. This could be achieved if those options were parameterizable, but the "Add Dynamic Content" option, which allows parameters to be used, is not available for them. (The static equivalent is sketched below.)
    Alternatively, add functionality to the Wait activity to check for the availability of a file/container within a stipulated/configured time.
    Such a feature is required, as it is available in other ETL tools.

    1 vote  ·  0 comments
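
    For context, the SDK already exposes static filtering on container and file-name prefix/suffix when defining the trigger; what is missing is dynamic content inside those fields. A hedged sketch of the static version (model names from azure.mgmt.datafactory; all values are placeholders):

        from azure.mgmt.datafactory.models import (
            BlobEventsTrigger, PipelineReference, TriggerPipelineReference)

        # Today these path filters accept only static text, not expressions.
        trigger = BlobEventsTrigger(
            events=["Microsoft.Storage.BlobCreated"],
            blob_path_begins_with="/incoming/blobs/",
            blob_path_ends_with=".csv",
            scope="<storage-account-resource-id>",
            pipelines=[TriggerPipelineReference(
                pipeline_reference=PipelineReference(reference_name="LoadFilePipeline"))])
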