Data Factory

Azure Data Factory allows you to manage the production of trusted information by offering an easy way to create, orchestrate, and monitor data pipelines over the Hadoop ecosystem using structured, semi-structured, and unstructured data sources. You can connect to your on-premises SQL Server, Azure databases, tables, or blobs and create data pipelines that process the data with Hive and Pig scripting, or custom C# processing. The service offers a holistic monitoring and management experience over these pipelines, including a view of their data production and data lineage down to the source systems. The outcome of Data Factory is the transformation of raw data assets into trusted information that can be shared broadly with BI and analytics tools.

Do you have an idea, suggestion or feedback based on your experience with Azure Data Factory? We’d love to hear your thoughts.

  1. Ability to Disable an Activity

    Please allow setting an activity to enabled or disabled, much like you can do in SSIS.

    This is important when you are developing and only want to execute a certain part of the pipeline, for example.
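
    Until such a switch exists, one rough workaround in ADF v2 is to wrap the step in an If Condition activity gated by a pipeline parameter. This is only a sketch; the parameter and activity names are illustrative:

    {
        "name": "MaybeRunStep",
        "type": "IfCondition",
        "typeProperties": {
            "expression": {
                "value": "@bool(pipeline().parameters.enableStep)",
                "type": "Expression"
            },
            "ifTrueActivities": [
                {
                    "name": "StepToToggle",
                    "description": "The activity you would otherwise want to disable",
                    "type": "Copy"
                }
            ]
        }
    }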

    675 votes  ·  under review  ·  27 comments
  2. Azure Data Factory - Google Analytics Connector

    Some customers need to extract information from Google Analytics to build a data lake or SQL DW and gather marketing insights by mixing it with other kinds of data.

    Today the options are paid custom SSIS packages or developing custom code.

    If it is not possible in Azure Data Factory, a native connector elsewhere in Azure, maybe Logic Apps, would be another way to extract this data.

    408 votes  ·  13 comments
  3. Bitbucket Integration

    We need to use Bitbucket for a project. We are mirroring our Azure DevOps repo, with the pipelines, to Bitbucket. It would be easier if there was native integration with Bitbucket.

    361 votes  ·  9 comments
  4. Allow static-value columns in addition to the columns available in source files

    We have a requirement to delete existing data in Azure SQL based on some criteria, but we don't have a way of assigning a global variable/parameter and passing its value across activities.

    We have different folders to pick up data from, and the two folders never have files at the same time. The data flow and transformation are the same, yet for the same kind of work we need to build separate data flows (multiple datasets and pipelines/activities).

    How about allowing a static value to be defined for a column in the dataset/pipeline?
    Example:

    Folder 1 data flow -> if
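
    One possible shape for the request, sketched as an additionalColumns-style option on the copy source. The property name and values here are assumptions for illustration, not a confirmed ADF feature:

    "source": {
        "type": "BlobSource",
        "additionalColumns": [
            { "name": "sourceFolder", "value": "Folder1" },
            { "name": "loadTag", "value": "some-static-value" }
        ]
    }
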
    302 votes  ·  under review  ·  14 comments
  5. Need for an Execute SQL Task in Azure Data Factory v2

    We only have an Execute Stored Procedure activity in ADF v2, but most of the time we don't want to create stored procedures for every primary ETL task, such as counting the number of records in a table, updating data in tables, or creating tables. Many such activities need T-SQL execution, so an Execute SQL option would be great.

    ADF v2 can use a variety of RDBMS source and sink systems such as MySQL, Oracle, etc. An Execute SQL task would be a powerful addition to Azure Data Factory v2, to be used in all…
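
    As a stopgap, arbitrary T-SQL can already be run by pointing the Stored Procedure activity at the built-in sp_executesql. A minimal sketch; the linked service name and statement are illustrative:

    {
        "name": "RunAdHocSql",
        "type": "SqlServerStoredProcedure",
        "linkedServiceName": { "referenceName": "AzureSqlDatabase", "type": "LinkedServiceReference" },
        "typeProperties": {
            "storedProcedureName": "sp_executesql",
            "storedProcedureParameters": {
                "stmt": { "value": "UPDATE dbo.Staging SET Processed = 1", "type": "String" }
            }
        }
    }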

    296 votes  ·  5 comments
  6. Allow choosing logical AND or logical OR in activity dependencies

    We have activity dependencies today, but they are always a logical AND. If we have Activity1 -> Activity2 -> Activity3 and we want to say "if any of these activities fail, run Activity4", it isn't straightforward. In SSIS, we can choose an expression and choose whether one or all conditions must be true when there are multiple constraints. We need similar functionality here. It can be achieved with a bit of creativity (repeat the failure activity as the single failure path after each of the original activities, use the If Condition to write logic that would…
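
    For reference, dependencies today are declared per upstream activity and combined with a logical AND, which is exactly the limitation. A minimal sketch of the current schema; Activity4 only fires if every listed condition holds:

    {
        "name": "Activity4",
        "type": "Copy",
        "dependsOn": [
            { "activity": "Activity1", "dependencyConditions": [ "Failed" ] },
            { "activity": "Activity2", "dependencyConditions": [ "Failed" ] },
            { "activity": "Activity3", "dependencyConditions": [ "Failed" ] }
        ]
    }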

    291 votes  ·  under review  ·  0 comments
  7. Add a new email activity with the ability to send attachments as part of the workflow.

    There are numerous instances when an output (statistics) or error file has to be mailed to administrators. An email activity would help implement this functionality.
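
    In the meantime, a common pattern is a Web activity that posts to an HTTP-triggered Logic App, which sends the mail and pulls the attachment from storage. The URL and body fields below are placeholders, not real endpoints:

    {
        "name": "SendErrorEmail",
        "type": "WebActivity",
        "typeProperties": {
            "url": "https://<your-logic-app-trigger-url>",
            "method": "POST",
            "headers": { "Content-Type": "application/json" },
            "body": {
                "subject": "Pipeline run failed",
                "attachmentBlobPath": "logs/errors/latest.log"
            }
        }
    }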

    173 votes  ·  1 comment
  8. Event Hub

    Support Event Hubs as both a source and a sink.

    135 votes  ·  6 comments
  9. Pause/Start Azure SQL Data Warehouse from ADF

    Pause/Start Azure SQL Data Warehouse from ADF
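
    Until a native activity exists, this can be approximated with a Web activity calling the ARM REST endpoint under the factory's managed identity. The resource names below are placeholders, and the api-version should be checked against current documentation:

    {
        "name": "PauseSqlDw",
        "type": "WebActivity",
        "typeProperties": {
            "url": "https://management.azure.com/subscriptions/<subId>/resourceGroups/<rg>/providers/Microsoft.Sql/servers/<server>/databases/<dwName>/pause?api-version=2017-10-01-preview",
            "method": "POST",
            "body": "{}",
            "authentication": { "type": "MSI", "resource": "https://management.azure.com/" }
        }
    }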

    96 votes  ·  5 comments
  10. Support encrypted flat files as the source in copy activities

    We encrypt sensitive flat files at rest using the approach below. Please add a feature to ADF that supports reading encrypted flat files from blob storage:
    https://azure.microsoft.com/en-us/documentation/articles/storage-encrypt-decrypt-blobs-key-vault/

    85 votes  ·  8 comments
  11. Support Azure app service API

    Can ADF consume data from, or push data to, an Azure App Service API, with Swagger support?

    78 votes  ·  0 comments
  12. 52 votes  ·  5 comments
  13. Data Catalog integration

    If Data Catalog (ADC) is to be our metadata store for helping users explore data sets, it occurs to me that there ought to be some sort of integration with ADF, so that new data sets appear automatically and their refresh status is available, letting end users know the data is up to date. I also notice there is no specific feedback category for ADC.
    Also, ADF should be able to consume data sets in ADC by populating the appropriate linked services and tables.

    38 votes  ·  1 comment
  14. Add datasets to Data Flows that can support all the connectors found in Azure Data Factory

    I have a use case where the source is SAP and the sink is ODBC-based, and I don't want to land the data in any intermediate staging storage; just let it flow through. To date, Data Flows support only five types of datasets when defining a source or a sink (Azure Blob Storage, Azure Data Lake Storage Gen1, Azure Data Lake Storage Gen2, Azure SQL Data Warehouse, and Azure SQL Database).

    28 votes  ·  under review  ·  4 comments
  15. Richer alerting

    Currently, Data Factory has alerts for failed and succeeded runs only.

    There are multiple other conditions that need action, so the user should be alerted:
    - Timed out runs
    - Data gateway updates required
    - Linked service credentials expired/expiring soon

    Can alerts for these conditions be added?

    24 votes  ·  under review  ·  2 comments
  16. Provide Parallelism in Copying data from Hive to SQL Server IAAS

    We need your expert input to improve the performance of a copy activity from our Hive table to SQL Server using an ADF pipeline. Currently the copy runs in single-threaded mode and takes 150 minutes to copy 20 GB of data. That 20 GB has been split into multiple files internally by Hive; we see it holds 51 files. Is there a way in ADF to load these files into SQL Server in parallel?
    Note: the internally split Hive files are not managed by us; they are generated automatically by Hive, and the file naming convention inside the Hive table (folder) in blob storage is not known.
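
    For what it's worth, the copy activity already exposes settings that may help here, assuming the source dataset points at the blob folder Hive writes to. A hedged sketch with illustrative values:

    "typeProperties": {
        "source": { "type": "BlobSource", "recursive": true },
        "sink": { "type": "SqlSink", "writeBatchSize": 10000 },
        "parallelCopies": 8
    }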

    20 votes  ·  under review  ·  1 comment
  17. Add an option to specify ScriptPath in SqlSource for Copy Activity

    When using SqlSource in a copy activity, there should be an option to specify a script path. If the SQL query is very big, it is very difficult to put the whole content in a single line.

    Below is the existing support. We should also have support for a script path inside the source key.

    "type": "Copy",

                "typeProperties": {
    
    "source": {
    "type": "SqlSource",
    "sqlReaderQuery": "SELECT TOP 300 * FROM dbo.Employee"
    }
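
    Purely as an illustration of the request, a hypothetical sqlReaderScriptPath property (it does not exist in ADF today) might look like:

    "source": {
        "type": "SqlSource",
        "sqlReaderScriptPath": "sqlscripts/get-top-employees.sql"
    }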

    17 votes  ·  0 comments
  18. Add some type of data view

    It would be great to be able to see the data coming from the data sources, or after some transformation.

    16 votes  ·  under review  ·  0 comments
  19. Copy Wizard should use Polybase to export from SQL DW to blob

    Currently it appears that the ADF Copy Wizard does not use PolyBase in SQL DW (CREATE EXTERNAL TABLE AS SELECT...) to export the contents of a table into blob storage. As this would be much faster, please support it. Also, if you're using the Copy Wizard to copy from DW to DW, please PolyBase out and PolyBase in.

    The same should apply to SQL DW to Azure Data Lake Store, as PolyBase now supports that scenario.

    13 votes  ·  2 comments
  20. Data slice function for dataset

    Allow a function to be defined for a dataset, something like a WHERE clause. The on-premises Oracle example is filtered by date, but the filter lives in the pipeline and is text manipulation of a SQL statement, which seems prone to error.
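
    For context, the pipeline-side approach being criticized looks roughly like this in ADF v1 (shown with SqlSource; the table and column names are illustrative), and it really is string-splicing a SQL statement:

    "source": {
        "type": "SqlSource",
        "sqlReaderQuery": "$$Text.Format('SELECT * FROM dbo.Sales WHERE ModifiedDate >= \\'{0:yyyy-MM-dd}\\' AND ModifiedDate < \\'{1:yyyy-MM-dd}\\'', WindowStart, WindowEnd)"
    }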

    11 votes  ·  0 comments