Data Factory

Azure Data Factory allows you to manage the production of trusted information by offering an easy way to create, orchestrate, and monitor data pipelines over the Hadoop ecosystem, using structured, semi-structured, and unstructured data sources. You can connect to your on-premises SQL Server, Azure databases, tables, or blobs and create data pipelines that process the data with Hive and Pig scripting or custom C# processing. The service offers a holistic monitoring and management experience over these pipelines, including a view of their data production and data lineage down to the source systems. The outcome of Data Factory is the transformation of raw data assets into trusted information that can be shared broadly with BI and analytics tools.

Do you have an idea, suggestion or feedback based on your experience with Azure Data Factory? We’d love to hear your thoughts.

  1. Copy Activity "dirty read" on Table selection

    In the Copy Activity, I can select Table, Query, or Stored Procedure.
    When I select Table, I need an option to do a "dirty read" (READ UNCOMMITTED). A simple checkbox on the Settings tab would do the trick.

    I can enter SQL myself and add a NOLOCK hint, but then there is no point in using the Table option. (See the sketch after this item.)

    1 vote  ·  0 comments
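
    For reference, the query-based workaround mentioned above might look like this in a Copy activity source definition; the table and column names here are hypothetical:

        "source": {
            "type": "SqlSource",
            "sqlReaderQuery": "SELECT Id, CustomerName FROM dbo.Orders WITH (NOLOCK)"
        }

    A "dirty read" checkbox on the Table option would generate the equivalent NOLOCK behavior without hand-written SQL.
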
  2. Iteration and Conditional Activities

    The current release requires that you add activities under the If Condition or ForEach activity. If you modify an existing pipeline to add an If Condition activity, you have to cut and paste all existing activities that follow it. For example, if you add an If Condition after activity A, you have to cut all activities after activity A, check that they are still valid and relevant after the addition, and paste them into the If Condition activity. That's a lot of extra work. Plus, these activities are not visible on the main screen. (See the sketch of the nesting after this item.)

    It would be…

    1 vote  ·  0 comments
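
    For context, activities nested under an If Condition live inside its typeProperties, which is why existing activities have to be moved into it by hand. A minimal sketch, with hypothetical activity and parameter names:

        {
            "name": "IfFullRun",
            "type": "IfCondition",
            "typeProperties": {
                "expression": {
                    "value": "@equals(pipeline().parameters.runMode, 'full')",
                    "type": "Expression"
                },
                "ifTrueActivities": [
                    {
                        "name": "ActivityB",
                        "type": "Wait",
                        "typeProperties": { "waitTimeInSeconds": 5 }
                    }
                ]
            }
        }
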
  3. Error handler

    Not every error needs to fail the whole pipeline; there are scenarios where you want an activity to count as a pass even when it fails.

    Take the Until activity: I am checking for a file's existence, and if the file is not available by the time provided, the activity should exit but not fail the pipeline, because the Until activity is doing exactly what it was modeled to do: check until that time. If I want, I can schedule it to check for the file every 2 hours.

    In that scenario I should be able to handle the timeout error as a pass. (See the sketch after this item.)

    1 vote  ·  0 comments
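
    For reference, the pattern described above might look like the following sketch, with hypothetical activity and dataset names; as things stand, exhausting the two-hour timeout fails the run instead of passing it:

        {
            "name": "WaitForFile",
            "type": "Until",
            "typeProperties": {
                "expression": {
                    "value": "@equals(activity('CheckFile').output.exists, true)",
                    "type": "Expression"
                },
                "timeout": "0.02:00:00",
                "activities": [
                    {
                        "name": "CheckFile",
                        "type": "GetMetadata",
                        "typeProperties": {
                            "dataset": { "referenceName": "ExpectedFile", "type": "DatasetReference" },
                            "fieldList": [ "exists" ]
                        }
                    }
                ]
            }
        }
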
  4. Support SQL query in dataset configured to read data from a Cloud-hosted D365 instance

    Permit a dataset using a Dynamics connection to be configured to read data from a cloud-hosted Dynamics 365 instance by specifying an SQL query instead of an entity name or a FetchXML query. This would work around the limitations of FetchXML and permit collecting just the required data.

    1 vote  ·  0 comments
  5. Need activities to delete files from either ADLS or an Azure Storage account

    1 vote  ·  1 comment
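
    Data Factory has since added a Delete activity along these lines; a minimal sketch, assuming a hypothetical dataset pointing at the folder to clean up:

        {
            "name": "DeleteLandedFiles",
            "type": "Delete",
            "typeProperties": {
                "dataset": { "referenceName": "LandingFolder", "type": "DatasetReference" },
                "recursive": true
            }
        }
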
  6. Allow me to save the layout of the pipeline monitor window so that refreshing does not lose the selections I have just made

    1 vote  ·  0 comments
  7. Add option to "Rerun from failed activity" in bulk

    ADFv2 has the option to rerun a given pipeline run from the activity it failed on, which is great.
    The problem is that, if we have several dozen failed pipeline runs (e.g. after recovering from an outage), there's no way to send them all to rerun from failed activity.
    The "Pipeline runs" view lets you select multiple pipeline runs and rerun them all - but that's a full rerun, without skipping succeeded activities.
    Please add the option to "rerun from failed activity" in this bulk selection mode!

    1 vote  ·  0 comments
  8. Allow publishing of a branch

    Right now, publishing is allowed only from the master branch. If I want to test a schedule, I have to publish to master, which seems inappropriate.

    1 vote  ·  0 comments
  9. Copy multiple specific shared files

    In some cases I already know the names of the files I need to transfer, so it would be useful to be able to specify a list of file names (relative paths) to be copied.

    In these cases, using a File System dataset, the shared folder would not be enumerated; instead, just the specific files I requested would be returned.

    It is like being able to provide a list of file paths instead of a single file; see the attached picture. (See also the sketch after this item.)

    1 vote  ·  0 comments
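
    Data Factory has since added a fileListPath setting on Copy activity sources that matches this request; a hedged sketch, with hypothetical paths, where the referenced text file lists one relative file path per line:

        "source": {
            "type": "BinarySource",
            "storeSettings": {
                "type": "FileServerReadSettings",
                "fileListPath": "share/config/files-to-copy.txt"
            }
        }
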
  10. Fix the support for Excel (large files > 45 MB)

    Pipeline runs crash when reading Excel files larger than 45 MB.

    1 vote  ·  0 comments
  11. Visual Studio Code

    Support authoring and monitoring of Azure Data Factory from Visual Studio Code.

    1 vote  ·  0 comments
  12. Add detailed tutorials on Data Factory

    We have a requirement to copy data from a primary DB to a secondary DB. We don't want to copy the whole DB; instead, we need the data older than the last 6 months, and we want to keep the last 6 months of data on the primary DB.
    I am trying to do this with the Data Factory Copy activity but am not able to.
    Please provide more detail on how to filter data, write a custom query to copy data, and pass parameters to a custom query. (See the sketch after this item.)

    1 vote  ·  0 comments
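
    A minimal sketch of the kind of filtered, parameterized source query being asked about; the table, column, and parameter names are hypothetical:

        "source": {
            "type": "AzureSqlSource",
            "sqlReaderQuery": {
                "value": "SELECT * FROM dbo.Orders WHERE OrderDate < '@{pipeline().parameters.cutoffDate}'",
                "type": "Expression"
            }
        }

    Here cutoffDate is a pipeline parameter, so the same pipeline can copy a different date window on each run.
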
  13. Configure Event-Based Triggers Between Environments

    Event-based triggers are amazing for handling dependencies. However, we should be able to switch the subscription and storage account a trigger points to when promoting between environments, for example by resolving them through Key Vault. (See the sketch after this item.)

    1 vote  ·  0 comments
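
    For context, a blob event trigger's scope is a fixed resource ID, which is what makes per-environment switching awkward today; a minimal sketch with placeholder identifiers:

        {
            "name": "OnNewBlob",
            "type": "BlobEventsTrigger",
            "typeProperties": {
                "scope": "/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.Storage/storageAccounts/<account>",
                "events": [ "Microsoft.Storage.BlobCreated" ],
                "blobPathBeginsWith": "/input/blobs/"
            }
        }
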
  14. Pre-Copy Script in the Copy Data activity should be able to use a SQL file as well as a hard-coded SQL script

    I am not completely sure, but it looks to me like the Pre-Copy Script in the Copy Data activity doesn't support using SQL files; we have to put hard-coded SQL there.
    If that assumption is correct, then it should also be able to read from SQL files. (See the sketch of the current inline form after this item.)

    1 vote  ·  0 comments
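
    As it stands, the pre-copy script is inlined in the sink definition, roughly like this; the statement itself is hypothetical:

        "sink": {
            "type": "AzureSqlSink",
            "preCopyScript": "TRUNCATE TABLE dbo.StagingOrders"
        }
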
  15. Data Factory activity output deciding the "color" of the entire pipeline status

    In an Azure Data Factory pipeline, you can have only 2 statuses: OK (green) or Failed (red).

    Would it be possible to have a third status, such as Partially Succeeded (orange), based on some custom outputs?

    The idea comes from this discussion: https://github.com/MicrosoftDocs/azure-docs/issues/31281

    1 vote  ·  0 comments
  16. Make a wizard similar to the new Copy (Preview) wizard that just allows you to type any long-running SQL statement

    We need a way to enter an ad hoc SQL statement that we know will run for hours or days and have it managed by ADF. (See the sketch after this item.)

    1 vote  ·  0 comments
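
    Data Factory has since added a Script activity that can run ad hoc SQL, which covers part of this; a hedged sketch, with a hypothetical linked service and statement (long-running statements remain subject to activity timeouts):

        {
            "name": "RunAdHocSql",
            "type": "Script",
            "linkedServiceName": { "referenceName": "AzureSqlLinkedService", "type": "LinkedServiceReference" },
            "typeProperties": {
                "scripts": [
                    { "type": "NonQuery", "text": "UPDATE dbo.Orders SET Archived = 1 WHERE OrderDate < '2019-01-01'" }
                ]
            }
        }
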
  17. Azure Table Storage key field translations, JSON element naming, and rollover text

    I think it was a mistake to have a separate system for setting PartitionKey and RowKey values in the sink section, when there was already a system in the translator area specifically for the purpose of translating fields (columns) when reading and writing entities. When I first tried to get a PartitionKey to come from another field in my source table (NewPKey in this example), I incorrectly tried to do it like this:

        "translator": {
            "type": "TabularTranslator",
            "columnMappings": "NewPKey: PartitionKey, RowKey: RowKey, Value: Value"
        }

    Of course, that didn't work. For logical consistency and simplicity, I wish that…

    1 vote  ·  0 comments
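
    For contrast, the separate sink-side mechanism the poster objects to looks roughly like this, reusing NewPKey from the example above; the exact property names here are an assumption:

        "sink": {
            "type": "AzureTableSink",
            "azureTablePartitionKeyName": "NewPKey",
            "azureTableRowKeyName": "RowKey"
        }
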