Data Factory

Azure Data Factory allows you to manage the production of trusted information by offering an easy way to create, orchestrate, and monitor data pipelines over the Hadoop ecosystem using structured, semi-structured, and unstructured data sources. You can connect to your on-premises SQL Server, Azure databases, tables, or blobs and create data pipelines that process the data with Hive and Pig scripting, or custom C# processing. The service offers a holistic monitoring and management experience over these pipelines, including a view of their data production and data lineage down to the source systems. The outcome of Data Factory is the transformation of raw data assets into trusted information that can be shared broadly with BI and analytics tools.

Do you have an idea, suggestion or feedback based on your experience with Azure Data Factory? We’d love to hear your thoughts.

  1. Pass dynamic path to event-based trigger

    We have a scenario where we require our pipeline to be triggered when a new file is created each day.
    E.g.:
    Sample 1: /2019/03/test20190301.ss
    Sample 2: /2019/03/test20190302.ss

    We need a trigger that can detect sample 1 and sample 2 dynamically.
    That is, the trigger should pick up the file path and file name (which contains the event date) on the fly.

    The path should be like /yyyy/MM/testyyyyMM_dd.ss, where the values are determined by the trigger from the event date.
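    As a rough sketch of how this might look today (the names are illustrative, not part of the original request): a Blob storage event trigger already exposes the folder path and file name of the triggering blob, which can be mapped to pipeline parameters, and a date-based path can be assembled from the trigger start time with existing expression functions.

      folderPath: @triggerBody().folderPath
      fileName:   @triggerBody().fileName

      derivedPath: @concat('/', formatDateTime(trigger().startTime, 'yyyy/MM'), '/test', formatDateTime(trigger().startTime, 'yyyyMMdd'), '.ss')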

    1 vote · 0 comments
  2. Use same filter for many data sources in data flow

    It should be possible to apply the same filter to several sources in a dataflow. For example, if I have three sources that all have a last updated field, I might want to say that for all these sources, I only want rows that are less than a year old. They should be able to go through the same filter and branch out into their respective flows again afterward. It becomes very repetitive and error-prone to have to add the same filter many times.

    1 vote · 0 comments
  3. Allow reusable components

    It should be possible to create reusable components. There are two cases for this:

    1) We use specialized ID generation based on row fields. The actual fields used and the number of fields used are dependent on the data, so it must be customizable.

    2) We have some fields that we would like to add to all our tables. It would be nice to be able to define some global derived columns that we could add - these should be parameterizable so that they can be combined with 1).

    1 vote · 0 comments
  4. Make it possible to switch settings for integration runtime

    It should be possible to change settings for integration runtimes, e.g. the number of cores. It's very frustrating to have to create a new one and assign it to all data flows. The best workaround I've found is to do the reassignment by search/replace in Visual Studio, but this seems less than ideal.

    1 vote · 0 comments
  5. File extension is empty

    I'm generating a CSV file with compression; after decompressing, the file has no extension - it's empty.

    Actual result when decompressing: abc.gzip --> abc
    My expectation: abc.gzip --> abc.csv

    3 votes · 0 comments
  6. DF - event trigger on update or insert within Azure SQL database

    It would be very handy to have an event-based trigger for a Data Factory pipeline whenever an action (insert, update) within an Azure SQL database is detected.

    1 vote · 0 comments
  7. typo in ADF: "paranthesis"

    The dynamic content editor shows this error:

    "Syntax error: Missing enclosing paranthesis"

    It says "parAnthesis" instead of "parenthesis".

    1 vote · 0 comments
  8. Parentheses Highlighting in Dynamic Expression

    Dynamic expressions are pretty useful in Data Factory for performing complex logic by nesting functions.

    It gets difficult to identify the closing parenthesis.

    It would be very helpful if the closing parenthesis were highlighted when the text cursor is at the opening parenthesis.

    3 votes · 0 comments
  9. Allow string_agg in data flow aggregations

    Currently, it's only possible to do numerical aggregations (count, sum, etc.) in data flow aggregations. Implementing something that works like SQL string_agg would be very helpful.
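    For reference, this is the kind of behaviour the request describes, shown in plain SQL (the table and column names are hypothetical): STRING_AGG collapses the values in each group into a single delimited string.

      SELECT CustomerID,
             STRING_AGG(ProductName, ', ') AS Products
      FROM   Orders
      GROUP  BY CustomerID;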

    11 votes · started · 1 comment
  10. Provide Query parameter support in Data Sources

    When using a query in a dataset, ADO-style @ParamName and ODBC-style ? parameters should be supported, for example SELECT * FROM dbo.Customers WHERE Mod_Date = ?. Parameter support should be similar to what stored procedures offer, with the ability to dynamically detect the parameters based on the query. While you can use dynamic SQL created within ADF, this approach is cumbersome and leaves you open to SQL injection attacks.
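    To illustrate the difference (ModDate is a hypothetical parameter; dbo.Customers is taken from the example above): today the query has to be assembled as dynamic SQL with an expression such as the first line, whereas the request is to bind the value as a real query parameter, as in the second.

      @concat('SELECT * FROM dbo.Customers WHERE Mod_Date = ''', pipeline().parameters.ModDate, '''')

      SELECT * FROM dbo.Customers WHERE Mod_Date = @ModDate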

    7 votes · 0 comments
  11. Copy Data Activity - Show data read / data written by pipeline

    Hi,

    In "Pipeline runs" there is the amount of data read / written per table. Is it possible to show this information by pipeline? That way we can know the total values.

    Best regards,
    Cristina

    1 vote · 0 comments
  12. rename

    Being able to customize the file name on the destination, and being able to customize the file name with dynamic parameters and functions.
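    As an illustrative sketch only (the parameter name is hypothetical), this is the sort of expression the request would allow on a sink file name, using functions that already exist in the Data Factory expression language:

      @concat(pipeline().parameters.TablePrefix, '_', formatDateTime(utcnow(), 'yyyyMMdd_HHmmss'), '.csv')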

    1 vote · 0 comments
  13. Execute Pipeline activity automatic rerun

    Possibility to automatically rerun the related pipeline when a failure occurs.

    This is to help cases where a single activity rerun will not get the pipeline on track, for example, when data must be submitted again from the beginning. In these cases, it might be necessary to rerun the complete pipeline.

    As of today, the Execute Pipeline activity does not offer the possibility to specify the number of retries that can be executed before the activity is set to failed.

    The workaround to implement a solution involves several components and seems unnecessarily complex.

    The attached picture describes a linear pipeline including…

    3 votes · 0 comments
  14. HITRUST compliance with Azure Data Factory

    In the Azure compliance offerings sheet, I see that Data Factory is not compliant with HITRUST. Is there a roadmap to support it?

    1 vote · 0 comments
  15. windowStart and windowEnd time fields should be available as display columns in Monitor section

    Currently the Pipeline runs and Trigger runs pages offer Run Start and Trigger time columns for display, sorting, and filtering. The windowStart and windowEnd times for the runs should be available as an option, or better, in the default view.

    1 vote · 0 comments
  16. Data Factory - Event Trigger (storage) Include advanced filter options

    After creating an event trigger, we can edit the event on the storage account.
    Can we have the ability to add advanced filters through Data Factory, since any changes made to the event are overwritten when the trigger is published?

    1 vote · 0 comments
  17. Add Postgres Database as Sink

    Data Factory supports Azure PostgreSQL as a sink. It would be great if a PostgreSQL database were also supported as a sink.

    3 votes · 0 comments
  18. duplicate row removal in Union transformation in Azure Data Flow

    Typically, if we use the Union operation in SQL, it removes duplicate rows.

    But the equivalent transformation, named Union, in Azure Data Flow does not remove duplicate rows.

    Ideally, there should be an option to do so.
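    For comparison, in plain SQL (table names are hypothetical), UNION removes duplicates while UNION ALL keeps them; the behaviour reported here for the Data Flow Union transformation matches UNION ALL.

      SELECT id, name FROM SourceA
      UNION
      SELECT id, name FROM SourceB;   -- duplicates removed

      SELECT id, name FROM SourceA
      UNION ALL
      SELECT id, name FROM SourceB;   -- duplicates kept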

    1 vote · 1 comment
  19. Add support for Copy Data from File store to move processed files to a specified folder

    The Copy Data activity is missing a basic feature available in most other data integration tools when processing files.

    When a file is processed within a file store, there should be the option to move it to another "processed" folder, with the destination folder configurable.

    When a file is not processed successfully, there should be the option to move it to another "error" folder, with the destination folder configurable. It should also offer the option to continue processing the rest of the files if one file is not processed successfully.

    1 vote · 0 comments
  20. Allow Data Factory Managed identity to run Databricks notebooks

    Integrate the Azure Data Factory managed identity with the Databricks service, like you did for Key Vault, Storage, etc.

    4 votes · 0 comments