Data Factory

Azure Data Factory allows you to manage the production of trusted information by offering an easy way to create, orchestrate, and monitor data pipelines over the Hadoop ecosystem using structured, semi-structured, and unstructured data sources. You can connect to your on-premises SQL Server, Azure database, tables or blobs and create data pipelines that will process the data with Hive and Pig scripting, or custom C# processing. The service offers a holistic monitoring and management experience over these pipelines, including a view of their data production and data lineage down to the source systems. The outcome of Data Factory is the transformation of raw data assets into trusted information that can be shared broadly with BI and analytics tools.

Do you have an idea, suggestion or feedback based on your experience with Azure Data Factory? We’d love to hear your thoughts.

  1. Please add support to specify longer timeout for Web Activity

    Data Factory version 2 currently supports Web Activities with a default timeout of 1 minute:

    https://docs.microsoft.com/en-us/azure/data-factory/control-flow-web-activity

    "REST endpoints that the web activity invokes must return a response of type JSON. The activity will timeout at 1 minute with an error if it does not receive a response from the endpoint."

    Please add the ability to specify a longer timeout period for complex tasks.

    244 votes
    4 comments
  2. Allow setting the timezone for slices

    Currently slices run in UTC, and if data sources are in other timezones, a simple DATEADD in the WHERE clause will cause missed data when there is a DST change.

    Additionally, adding DATEADD on every source is error-prone, especially if a server changes its timezone in the future.

    Allow us to set the timezone at the pipeline, linked service, or dataset level. Any of these would do, as long as ADF transparently translates SliceStart and SliceEnd to the appropriate timezone.

    213 votes
    7 comments
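To make the DST problem in idea 2 concrete, here is a minimal Python sketch, not ADF code. It compares a time-zone-aware conversion of UTC slice boundaries with a hard-coded hour offset of the kind a DATEADD in a query would apply. The function names, the sample dates, and the "America/New_York" zone are assumptions chosen for illustration, and the sketch assumes Python 3.9+ with time zone data available on the machine.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo


def slice_to_local(slice_start_utc, slice_end_utc, tz_name):
    """Convert UTC slice boundaries to naive local wall-clock times in tz_name."""
    tz = ZoneInfo(tz_name)
    start = slice_start_utc.astimezone(tz).replace(tzinfo=None)
    end = slice_end_utc.astimezone(tz).replace(tzinfo=None)
    return start, end


def slice_with_fixed_offset(slice_start_utc, slice_end_utc, offset_hours):
    """Apply a constant hour offset, as a hard-coded DATEADD would."""
    delta = timedelta(hours=offset_hours)
    start = (slice_start_utc + delta).replace(tzinfo=None)
    end = (slice_end_utc + delta).replace(tzinfo=None)
    return start, end


# Hypothetical daily slice just after the 2021 US spring-forward date (2021-03-14).
start_utc = datetime(2021, 3, 15, 0, 0, tzinfo=timezone.utc)
end_utc = datetime(2021, 3, 16, 0, 0, tzinfo=timezone.utc)

aware = slice_to_local(start_utc, end_utc, "America/New_York")
fixed = slice_with_fixed_offset(start_utc, end_utc, -5)  # winter offset left in place

print("time-zone aware:", aware)
print("fixed offset:   ", fixed)
```

After the DST change the fixed-offset window is shifted by one hour relative to the source's local time, so rows near the slice boundaries can be missed or read twice.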
  3. Add a new email activity with the ability to send attachments as part of the workflow.

    There are numerous instances when an output (statistics) or error file has to be mailed to administrators. Email as an activity will help in implementing this functionality

    206 votes
    1 comment
  4. Change Data Capture feature for RDBMS (Oracle, SQL Server, SAP HANA, etc)

    Is it possible to have CDC features in ADF, please?

    196 votes
    planned  ·  11 comments
  5. Azure Data Factory - Restart an entire pipeline

    Currently in Azure Data Factory, there is no functionality to restart an entire pipeline. If we need to refresh a dataset in Azure, all associated activities in the pipeline have to be selected and run separately. Can we have an option to run the entire pipeline when required?

    190 votes
    5 comments
  6. Post-copy script in Copy Activity

    The Copy activity has a pre-copy script feature. A similar post-copy script feature would make it possible to execute code from the same activity once the copy operation has completed.

    Traditionally, when data is copied from a source SQL database to a destination SQL database, it is copied incrementally from the source into temporary, staging, or in-memory tables in the destination. After the copy, merge code is executed to merge the data into the target table.

    If a post-copy script option is provided in the Copy activity, the merge code can be called from the Copy activity itself instead of from another activity such as Execute Stored Procedure.

    183 votes
    5 comments
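The stage-then-merge pattern described in idea 6 can be sketched outside ADF. The Python example below uses an in-memory SQLite database as a stand-in for the destination. The table names, the helper `copy_with_post_script`, and the use of SQLite's `INSERT OR REPLACE` in place of a real MERGE statement are all assumptions made for illustration. They are not ADF features.

```python
import sqlite3


def copy_with_post_script(conn, rows, post_copy_sql):
    """Load rows into a staging table, then run a post-copy script in the same step."""
    cur = conn.cursor()
    cur.execute("DELETE FROM stage")  # plays the role of the existing pre-copy script
    cur.executemany("INSERT INTO stage (id, value) VALUES (?, ?)", rows)  # the copy
    cur.executescript(post_copy_sql)  # the requested post-copy script
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE stage (id INTEGER PRIMARY KEY, value TEXT);
    CREATE TABLE target (id INTEGER PRIMARY KEY, value TEXT);
    INSERT INTO target VALUES (1, 'old'), (2, 'keep');
    """
)

# SQLite stand-in for a MERGE: replace matching ids, insert the rest.
MERGE_SQL = "INSERT OR REPLACE INTO target (id, value) SELECT id, value FROM stage;"

copy_with_post_script(conn, [(1, "new"), (3, "added")], MERGE_SQL)
result = conn.execute("SELECT id, value FROM target ORDER BY id").fetchall()
print(result)
```

Here the merge runs as part of the same call that performs the copy, which is the behaviour the idea asks for instead of chaining a separate stored procedure activity.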
  7. Allow parameters for pipeline reference name in ExecutePipeline activity

    The ExecutePipeline activity accepts only a constant pipeline name. It should allow a pipeline to be invoked dynamically by accepting a parameter for the pipeline reference name.

    155 votes
    6 comments
  8. Event Hub

    Source and sink.

    144 votes
    6 comments
  9. Add PowerShell cmdlet support for import/export ARM Template

    Currently there is no means to duplicate the functionality of the import and export ARM template in the adf.azure.com GUI. This is a major gap in the DevOps / CI/CD story. Please add PowerShell cmdlets that allow import and export of just the ADF factory assets. Optionally, provide the ability to include the creation of the data factory in the exported template.

    138 votes
    4 comments
  10. 125 votes
    3 comments
  11. ADF connection to Azure Delta Lake

    Are there any plans to provide a connection between ADF v2 / Mapping Data Flows and Azure Delta Lake? It would be a great new source and sink for ADF pipelines and Mapping Data Flows, providing full ETL/ELT CDC capabilities to simplify complex lambda data warehouse architecture requirements.

    120 votes
    planned  ·  9 comments
  12. Persist global temporary tables between activities

    It is currently not possible to access a global temporary table created by one activity from a subsequent activity.

    If this were possible, you could create a pipeline with a Copy activity chained to a Stored Procedure activity, with both accessing the same global temporary table. The benefit is that operations against database-scoped temporary tables aren't logged, so you can load millions of records in seconds.

    119 votes
    1 comment
  13. NetSuite connector

    It would be great if there was a NetSuite connector

    116 votes
    1 comment
  14. Save git branch settings on server side/Factory wide

    The customer needs to use a specific branch for Data Factory resources, but for now the branch setting is saved to a cookie as <user>_preference, and we have to answer the "Branch selection" dialogue every time the cache is cleared or the factory is accessed from a different machine or user.
    Please add functionality to save this as a factory-wide setting to avoid user error.

    107 votes
    0 comments
  15. Richer variable support

    Allow me to have custom variables at the pipeline and factory level, which can be refreshed on a specified schedule from a dataset. The closest analogue for this would be SSIS variables.

    One use case for this would be for me to store a set of UTC offsets in a SQL table for each data source, and query this table at pipeline runtime to retrieve the correct offset for each source. This offset can then be stored in variables for each pipeline.

    107 votes
    4 comments
  16. Allow Stored Procedure Output Parameters

    In Logic Apps, it is possible to execute a stored procedure that contains an OUTPUT parameter, read the value of that output parameter, and pass it to the next activity in the Logic App. This is not possible with ADF v2.

    105 votes
    4 comments
  17. Pause/Start Azure SQL Data Warehouse from ADF

    Pause/Start Azure SQL Data Warehouse from ADF

    102 votes
    5 comments
  18. Support encrypted flat files as the source in copy activities

    We use this approach to encrypt sensitive flat files at rest. Please add a feature to ADF to support reading from encrypted flat files in blob storage:
    https://azure.microsoft.com/en-us/documentation/articles/storage-encrypt-decrypt-blobs-key-vault/

    101 votes
    9 comments
  19. Add the ability to restart an activity from within a pipeline within a master pipeline in ADFv2

    If a pipeline structure consists of a master pipeline containing child pipelines that hold the activities, it is not possible to restart a child pipeline and have the parent recognise when the child pipeline completes. Add functionality that allows an activity in the child pipeline to be restarted, with the result passed back to the parent pipeline once it completes successfully.

    100 votes
    4 comments
  20. 96 votes
    6 comments