Data Factory

Azure Data Factory allows you to manage the production of trusted information by offering an easy way to create, orchestrate, and monitor data pipelines over the Hadoop ecosystem using structured, semi-structured, and unstructured data sources. You can connect to your on-premises SQL Server, Azure databases, tables, or blobs and create data pipelines that process the data with Hive and Pig scripting, or custom C# processing. The service offers a holistic monitoring and management experience over these pipelines, including a view of their data production and data lineage down to the source systems. The outcome of Data Factory is the transformation of raw data assets into trusted information that can be shared broadly with BI and analytics tools.

Do you have an idea, suggestion or feedback based on your experience with Azure Data Factory? We’d love to hear your thoughts.

  1. Add ability to customize output fields from Execute Pipeline Activity

    This request comes directly from a StackOverflow post: https://stackoverflow.com/questions/57749509/how-to-get-custom-output-from-an-executed-pipeline
    Currently, the output from the Execute Pipeline activity is limited to the name and runId of the executed pipeline, making it difficult to pass any data or settings from the executed pipeline back to the parent pipeline. For instance, if a variable is set in the child pipeline, there is no built-in way to pass this variable back in the Azure Data Factory UI. A couple of workarounds exist, as detailed in the StackOverflow post above, but adding this as a built-in feature would greatly enhance the ability…

    242 votes · 3 comments
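    One workaround from the StackOverflow post is to have the child pipeline persist its return value (for example, to blob storage) and have the parent read it back once the Execute Pipeline activity completes. A minimal sketch of the parent-side activities, assuming a hypothetical `ChildPipeline` and a `ChildOutputDataset` pointing at the file the child writes:

    ```json
    [
      {
        "name": "RunChild",
        "type": "ExecutePipeline",
        "typeProperties": {
          "pipeline": { "referenceName": "ChildPipeline", "type": "PipelineReference" },
          "waitOnCompletion": true
        }
      },
      {
        "name": "ReadChildOutput",
        "type": "Lookup",
        "dependsOn": [ { "activity": "RunChild", "dependencyConditions": [ "Succeeded" ] } ],
        "typeProperties": {
          "source": { "type": "JsonSource" },
          "dataset": { "referenceName": "ChildOutputDataset", "type": "DatasetReference" }
        }
      }
    ]
    ```

    Out of the box, only values such as `@activity('RunChild').output.pipelineRunId` are exposed by the Execute Pipeline activity itself.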
  2. Please add support to specify longer timeout for Web Activity

    Data Factory version 2 currently supports Web Activities with a default timeout of 1 minute:

    https://docs.microsoft.com/en-us/azure/data-factory/control-flow-web-activity

    "REST endpoints that the web activity invokes must return a response of type JSON. The activity will timeout at 1 minute with an error if it does not receive a response from the endpoint."

    Please add ability to specify a longer timeout period for complex tasks.

    228 votes · 4 comments
  3. Allow setting the timezone for slices

    Currently slices run in UTC, and if data sources are in other timezones, a simple DATEADD in the WHERE clause will cause missed data when there is a DST change.

    Additionally, adding a DATEADD to every source is error-prone, especially if a server changes its timezone in the future.

    Allow us to set the timezone at the pipeline level, the linked service level, or the dataset level -- any of these would do, as long as ADF transparently translates SliceStart and SliceEnd to the appropriate timezone.

    210 votes · 7 comments
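    In Data Factory v2 pipelines, a per-pipeline stopgap is the expression function `convertFromUtc`, which accepts a Windows time zone name and honours DST -- though it still has to be repeated in every pipeline rather than configured once. A sketch, where the variable name, parameter name, and time zone are examples:

    ```json
    {
      "name": "SetLocalWindowStart",
      "type": "SetVariable",
      "typeProperties": {
        "variableName": "localWindowStart",
        "value": "@convertFromUtc(pipeline().parameters.windowStart, 'AUS Eastern Standard Time')"
      }
    }
    ```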
  4. Add a new email activity with the ability to send attachments as part of the workflow.

    There are numerous instances when an output (statistics) or error file has to be mailed to administrators. An email activity would help implement this functionality.

    204 votes · 1 comment
  5. Change Data Capture feature for RDBMS (Oracle, SQL Server, SAP HANA, etc)

    Is it possible to have CDC features in ADF, please?

    193 votes · planned · 11 comments
  6. Azure Data Factory - Restart an entire pipeline

    Currently in Azure Data Factory, there is no functionality to restart an entire pipeline. If we need to refresh a dataset in Azure, all associated activities in the pipeline have to be selected and run separately. Can we have an option to run the entire pipeline when required?

    187 votes · 5 comments
  7. Post-copy script in Copy Activity

    The Copy activity offers a pre-copy script. A post-copy script feature would similarly allow code to be executed from the same activity once the copy operation completes.

    Traditionally, when data is copied from a source SQL database to a destination SQL database, the data is copied incrementally into temporary/staging tables (or in-memory tables) in the destination. After the copy, merge code is executed to merge the data into the target table.

    A post-copy script option in the Copy activity would allow the merge code to be called from the Copy activity itself, instead of chaining another activity such as a Stored Procedure activity.

    173 votes · 5 comments
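    Today this pattern takes two chained activities: the Copy stages the increment, then a Stored Procedure activity runs the merge. A trimmed sketch of what a post-copy script would collapse into a single activity (dataset references omitted for brevity; table and procedure names are hypothetical):

    ```json
    [
      {
        "name": "StageIncrement",
        "type": "Copy",
        "typeProperties": {
          "source": { "type": "SqlSource" },
          "sink": { "type": "SqlSink", "preCopyScript": "TRUNCATE TABLE dbo.StageOrders" }
        }
      },
      {
        "name": "MergeIntoTarget",
        "type": "SqlServerStoredProcedure",
        "dependsOn": [ { "activity": "StageIncrement", "dependencyConditions": [ "Succeeded" ] } ],
        "typeProperties": { "storedProcedureName": "dbo.MergeOrders" }
      }
    ]
    ```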
  8. Allow parameters for pipeline reference name in ExecutePipeline activity

    The ExecutePipeline activity accepts only a constant pipeline name. It should allow a pipeline to be invoked dynamically by accepting a parameter for the pipeline reference name.

    153 votes · 6 comments
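    Until this is built in, one workaround is a Web activity that calls the Data Factory REST API's createRun endpoint, building the URL from a parameter. A sketch, assuming the factory's managed identity has been granted permission to run pipelines on itself (angle-bracket values are placeholders):

    ```json
    {
      "name": "RunPipelineByName",
      "type": "WebActivity",
      "typeProperties": {
        "method": "POST",
        "url": "@concat('https://management.azure.com/subscriptions/<subId>/resourceGroups/<rg>/providers/Microsoft.DataFactory/factories/<factory>/pipelines/', pipeline().parameters.pipelineName, '/createRun?api-version=2018-06-01')",
        "body": "{}",
        "authentication": { "type": "MSI", "resource": "https://management.azure.com/" }
      }
    }
    ```

    Unlike the ExecutePipeline activity, this is fire-and-forget: there is no equivalent of waitOnCompletion.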
  9. Event Hub

    Support Azure Event Hubs as both a source and a sink.

    144 votes · 6 comments
  10. Add PowerShell cmdlet support for import/export ARM Template

    Currently there is no means to duplicate the functionality of the import and export ARM template features in the adf.azure.com GUI. This is a major gap in the DevOps / CI/CD story. Please add PowerShell cmdlets that allow import and export of just the ADF factory assets. Optionally, provide the ability to include the creation of the data factory in the exported template.

    137 votes · 4 comments
  11. 125 votes · 3 comments
  12. Persist global temporary tables between activities

    It is currently not possible to access a global temporary table created by one activity from a subsequent activity.

    If this were possible, you could create a pipeline with a Copy activity chained to a Stored Procedure activity, with both accessing the same global temporary table. The benefit is that operations against database-scoped temporary tables aren't logged, so you can load millions of records in seconds.

    111 votes · 1 comment
  13. NetSuite connector

    It would be great if there were a NetSuite connector.

    111 votes · 1 comment
  14. Save git branch settings on server side/Factory wide

    Customers need to use a specific branch for a Data Factory resource, but currently the branch setting is saved to a cookie as <user>_preference, so we have to answer the "Branch selection" dialog every time the cache is cleared or the factory is accessed from a different machine or user.
    Please add the ability to save this as a factory-wide setting to avoid user error.

    107 votes · 0 comments
  15. Richer variable support

    Allow custom variables at the pipeline and factory level, which can be refreshed on a specified schedule from a dataset -- the closest analogue would be SSIS variables.

    One use case: store a set of UTC offsets in a SQL table, one per data source, and query this table at pipeline runtime to retrieve the correct offset for each source. The offset can then be stored in a variable for each pipeline.

    107 votes · 4 comments
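    Part of this use case can be approximated today with a Lookup chained to a Set Variable activity, although the value only lives for that single run. A sketch, where the table, dataset, and variable names are all examples:

    ```json
    [
      {
        "name": "GetOffset",
        "type": "Lookup",
        "typeProperties": {
          "source": {
            "type": "SqlSource",
            "sqlReaderQuery": "SELECT UtcOffsetMinutes FROM dbo.SourceOffsets WHERE SourceName = 'Orders'"
          },
          "dataset": { "referenceName": "ConfigDbDataset", "type": "DatasetReference" }
        }
      },
      {
        "name": "SetOffset",
        "type": "SetVariable",
        "dependsOn": [ { "activity": "GetOffset", "dependencyConditions": [ "Succeeded" ] } ],
        "typeProperties": {
          "variableName": "utcOffset",
          "value": "@string(activity('GetOffset').output.firstRow.UtcOffsetMinutes)"
        }
      }
    ]
    ```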
  16. Allow Stored Procedure Output Parameters

    In Logic Apps, it is possible to execute a stored procedure that contains an OUTPUT parameter, read the value of that output parameter, and pass it to the next activity in the Logic App. This is not possible with ADF v2.

    102 votes · 4 comments
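    A common workaround is to wrap the call in a query so the OUTPUT value comes back as a result set, then read it with a Lookup activity. A sketch, where the procedure, dataset, and column names are hypothetical:

    ```json
    {
      "name": "GetProcResult",
      "type": "Lookup",
      "typeProperties": {
        "source": {
          "type": "SqlSource",
          "sqlReaderQuery": "DECLARE @out INT; EXEC dbo.MyProc @Result = @out OUTPUT; SELECT @out AS Result;"
        },
        "dataset": { "referenceName": "AzureSqlDataset", "type": "DatasetReference" }
      }
    }
    ```

    Downstream activities can then reference `@activity('GetProcResult').output.firstRow.Result`.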
  17. Pause/Start Azure SQL Data Warehouse from ADF

    Pause/Start Azure SQL Data Warehouse from ADF

    101 votes · 5 comments
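    In the meantime, this can be scripted with a Web activity against the ARM pause/resume endpoints for the warehouse, assuming the factory's managed identity has sufficient rights on it (angle-bracket values are placeholders; the api-version shown is the commonly used one and may change):

    ```json
    {
      "name": "PauseDW",
      "type": "WebActivity",
      "typeProperties": {
        "method": "POST",
        "url": "https://management.azure.com/subscriptions/<subId>/resourceGroups/<rg>/providers/Microsoft.Sql/servers/<server>/databases/<dw>/pause?api-version=2017-10-01-preview",
        "body": "{}",
        "authentication": { "type": "MSI", "resource": "https://management.azure.com/" }
      }
    }
    ```

    Swap /pause for /resume in the URL to start the warehouse.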
  18. Support encrypted flat files as the source in copy activities

    We use this approach to encrypt sensitive flat files at rest. Please add a feature to ADF to support reading from encrypted flat files in blob storage:
    https://azure.microsoft.com/en-us/documentation/articles/storage-encrypt-decrypt-blobs-key-vault/

    100 votes · 9 comments
  19. Add the ability to restart an activity from within a pipeline within a master pipeline in ADFv2

    If the pipeline structure is a master pipeline containing child pipelines, with the activities held within those children, it is not possible to restart a child pipeline and have the parent recognise when it completes. Add the ability to restart an activity in the child pipeline, with the result passed back to the parent pipeline on successful completion.

    97 votes · 4 comments
  20. Fault Tolerance - Skip Incompatible Columns

    I am loading data dynamically from multiple SQL Server databases. All of the databases share the same table structure. However, there may be some exceptions where one of the databases has a missing column in one of the tables.

    When such a situation occurs, I would like to set the fault tolerance to skip the incompatible column. That is, instead of the copy failing or skipping entire rows, it should skip just that single column.

    That way, instead of losing whole rows, I lose one column -- which is not used anyway, since it doesn't exist in that source.

    96 votes · 0 comments