Data Factory

Azure Data Factory allows you to manage the production of trusted information by offering an easy way to create, orchestrate, and monitor data pipelines over the Hadoop ecosystem using structured, semi-structured, and unstructured data sources. You can connect to your on-premises SQL Server, Azure databases, tables, or blobs and create data pipelines that process the data with Hive and Pig scripting, or custom C# processing. The service offers a holistic monitoring and management experience over these pipelines, including a view of their data production and data lineage down to the source systems. The outcome of Data Factory is the transformation of raw data assets into trusted information that can be shared broadly with BI and analytics tools.

Do you have an idea, suggestion or feedback based on your experience with Azure Data Factory? We’d love to hear your thoughts.

  1. HITRUST Compliance with Azure Data Factory

    In the Azure compliance offerings sheet, I see that Data Factory is not compliant with HITRUST. Is there a roadmap to support it?

    498 votes  ·  8 comments
  2. 97 votes  ·  3 comments
  3. Fault Tolerance - Skip Incompatible Columns

    I am loading data dynamically from multiple SQL Server databases. All of the databases share the same table structure, but occasionally one of the databases is missing a column in one of the tables.

    When that happens, I would like to set the fault tolerance to skip the incompatible column. That is, instead of the copy failing or skipping entire rows, it should skip just the single column.

    That way, instead of losing whole rows, I lose one column, which is unused anyway since it does not exist in that database.

    96 votes  ·  0 comments
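    The requested column-level fault tolerance can be sketched in plain Python (illustrative only, not existing ADF behavior; function and column names are made up): copy every row, but silently drop any expected column that a particular source does not have.

```python
# Sketch of the requested behavior: skip a missing column
# instead of failing the copy or skipping entire rows.

def copy_rows(rows, expected_columns):
    """Copy rows, dropping expected columns that are absent from a row."""
    copied = []
    for row in rows:
        # Keep only the expected columns that actually exist in this row;
        # a column missing in one source database is simply skipped.
        copied.append({col: row[col] for col in expected_columns if col in row})
    return copied

rows = [
    {"id": 1, "name": "a", "extra": "x"},  # has all columns
    {"id": 2, "name": "b"},                # 'extra' column missing
]
result = copy_rows(rows, ["id", "name", "extra"])
# Row 2 is still copied; only its missing 'extra' column is skipped.
```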
  4. Add ability to customize output fields from Execute Pipeline Activity

    This request comes directly from a StackOverflow post, https://stackoverflow.com/questions/57749509/how-to-get-custom-output-from-an-executed-pipeline .
    Currently, the output from the Execute Pipeline activity is limited to the executed pipeline's name and runId, making it difficult to pass any data or settings from the executed pipeline back to the parent pipeline. For instance, if a variable is set in the child pipeline, there is no built-in way to pass this variable back in the Azure Data Factory UI. A couple of workarounds exist, as detailed in the StackOverflow post above, but adding this as a built-in feature would greatly enhance the ability…

    379 votes  ·  7 comments
  5. Dark theme for Data Factory Web UI

    A dark theme for the Azure Data Factory web UI would be a nice addition for those of us who prefer dark themes in general. It would also be consistent with the Azure portal.

    106 votes  ·  0 comments
  6. Please allow users to automate “Publish”.

    Currently, somebody must go to the dev ADF, open the master branch, and hit “Publish” for the changes to reach the adf_publish branch.

    Automating “Publish” on the master branch would improve efficiency and save time.

    35 votes  ·  1 comment
  7. ADF connection to Azure Delta Lake

    Are there any plans to provide a connection between ADF v2/Mapping Data Flows and Azure Delta Lake? It would be a great new source and sink for ADF pipelines and Mapping Data Flows, providing full ETL/ELT CDC capabilities to simplify complex lambda data warehouse architecture requirements.

    137 votes  ·  10 comments
  8. ADF to kill the queued jobs

    When something is stuck, we want to restart the pipelines fresh. It would be helpful to have an option to kill the already-queued jobs and start the pipelines from scratch. Currently we have no option other than to wait for the queued jobs to complete.

    23 votes  ·  0 comments
  9. MSI auth for Azure Files connector

    Please consider adding MSI authentication support for the Azure Files connector.
    Currently, only storage account key authentication is supported, which makes it impossible for the Azure IR to connect to Azure Files in the same region.
    For a firewall-enabled storage account in the same region as the Azure IR, we need to use the "Allow trusted Microsoft services" option on the firewall.

    25 votes  ·  0 comments
  10. ForEach activity - Allow break

    Allow breaking out of a ForEach activity, the way ForEach works in most languages. Currently ForEach iterates over all items to the end, even when we don't want it to.

    If one of the items produces an error, I may want to break out of the ForEach, stop iterating, and throw that error.

    For now, I have to use a flag variable and If Conditions to stop ForEach from continuing to call all the activities.

    60 votes  ·  1 comment
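    The requested break semantics and today's flag-variable workaround can both be expressed in ordinary Python (a sketch only; ADF offers no such break, and all names here are illustrative):

```python
# Sketch (plain Python) of the two behaviors: the requested break,
# and the current flag-variable workaround.

def foreach_with_break(items, activity):
    """Requested: stop at the first failing item and raise its error."""
    for item in items:
        activity(item)  # an exception here ends the loop immediately

def foreach_with_flag(items, activity):
    """Workaround: ForEach still visits every item; an If Condition
    guarded by a flag variable turns the remaining iterations into no-ops."""
    failed = None
    for item in items:              # cannot break out of ForEach
        if failed is None:          # If Condition checks the flag variable
            try:
                activity(item)
            except Exception as exc:
                failed = exc        # Set Variable flips the flag
    if failed is not None:
        raise failed

calls = []
def activity(item):
    calls.append(item)
    if item == "bad":
        raise ValueError(item)

try:
    foreach_with_flag(["a", "bad", "c", "d"], activity)
except ValueError:
    pass
# Only "a" and "bad" actually ran; "c" and "d" were no-ops via the flag.
```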
  11. Save git branch settings on server side/Factory wide

    Customers need to use a specific branch for a Data Factory resource, but for now the branch setting is saved in a cookie as <user>_preference, so we have to answer the "Branch selection" dialog every time the cache is cleared or when accessing from a different machine or user.
    Please add the ability to save this as a factory-wide setting to avoid user error.

    107 votes  ·  0 comments
  12. Allow Data Factory Managed identity to run Databricks notebooks

    Integrate the Azure Data Factory Managed Identity with the Databricks service, like you did for Key Vault, Storage, etc.

    68 votes  ·  3 comments  ·  planned
  13. Data Factory should be able to use a VNet without resorting to self-hosted IR

    Self-hosted makes a lot of sense when integrating on-premises data, but it's a shame to need to maintain a self-hosted integration runtime VM when you want to leverage the extra security of a VNet, i.e. firewalled storage accounts etc.

    Ideally, the Azure-managed integration runtimes would be able to join a VNet on demand.

    649 votes  ·  17 comments

    We are very excited to announce the public preview of Azure Data Factory Managed Virtual Network.

    With this new feature, you can provision the Azure Integration Runtime in a Managed Virtual Network and leverage Private Endpoints to securely connect to supported data stores. Your data traffic between the Azure Data Factory Managed Virtual Network and your data stores goes through Azure Private Link, which provides secured connectivity and eliminates your data's exposure to the public internet. With the Managed Virtual Network along with Private Endpoints, you can also offload the burden of managing the virtual network to Azure Data Factory and protect against data exfiltration.

    To learn more about Azure Data Factory Managed Virtual Network, see https://azure.microsoft.com/blog/azure-data-factory-managed-virtual-network/

  14. Data Flow static IP address ranges and "allow trusted Microsoft services"

    As far as I know, the IP address of Data Flow cannot be specified, and Data Flow isn't included in trusted Microsoft services.
    To enhance security for Azure SQL DB, Azure Storage, etc., please consider adding these features.

    24 votes  ·  1 comment  ·  started
  15. Please allow increasing the timeout beyond 30 seconds for Azure Table Storage

    When I copy data from Azure Table Storage, a timeout sometimes occurs on the Azure Table Storage side.
    At present, the maximum timeout for Azure Table Storage is 30 seconds.
    Please make it possible to increase the timeout beyond 30 seconds for Azure Table Storage.

    29 votes  ·  0 comments
  16. Make it possible to run a wrangling data flow with a custom Azure Integration Runtime

    As far as I know, the wrangling data flow activity does not support an Azure integration runtime with time to live. So if I want to configure TTL, I have to run wrangling data flows specifically on the AutoResolveIntegrationRuntime and the rest of the data flows on the configured Azure IR.
    It would be better to run all data flows on one Azure IR, so we could save the time needed to warm up a cluster. Please consider adding this feature.

    16 votes  ·  0 comments
  17. Please improve ADF to handle tabular column names with a dot character as just a name, not a path separator

    Currently, when importing a CSV file into Cosmos DB using ADF, ADF can't handle tabular column names containing a dot character; it always treats the dot as a path separator. Please improve ADF so that a dot in a column name can be treated as just part of the name.

    [Example CSV file column names]

    'DataA.region','DataA.NumberA','DataA.NumberB','pk'

    [JSON data that I want to import into Cosmos DB]

        'DataA.region': 'japaneast',
        'DataA.NumberA': '',
        'DataA.NumberB': '',
        'pk': '1',

    [JSON Data that actual

    17 votes  ·  0 comments
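    The two interpretations of a dotted column name can be illustrated in plain Python (a sketch of the distinction being requested, not ADF code; function names are made up):

```python
# Illustration of the two interpretations of a dotted column name.

def as_literal_key(doc, name, value):
    """Requested: treat 'DataA.region' as a single flat key."""
    doc[name] = value
    return doc

def as_path(doc, name, value):
    """Current behavior: treat the dot as a path separator and nest."""
    parts = name.split(".")
    cur = doc
    for part in parts[:-1]:
        cur = cur.setdefault(part, {})  # descend, creating nested objects
    cur[parts[-1]] = value
    return doc

flat = as_literal_key({}, "DataA.region", "japaneast")
# flat   == {'DataA.region': 'japaneast'}     (what the request wants)
nested = as_path({}, "DataA.region", "japaneast")
# nested == {'DataA': {'region': 'japaneast'}} (what ADF produces today)
```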
  18. Publish Azure Data Factory ARM templates to a custom folder in the publish branch

    Provide the ability to publish Azure Data Factory ARM templates to a custom folder in the publish branch. An additional property could be added to the publish_config.json file to cater for this, e.g.

    {
        "publishBranch": "release/adf_publish",
        "publishFolder": "Deployment/ARM/"
    }

    https://docs.microsoft.com/en-us/azure/data-factory/source-control#configure-publishing-settings

    50 votes  ·  2 comments
  19. Support dynamic content in pagination rules for the REST source in the Copy Data activity

    I'm using a pagination rule in a Copy Data activity from a REST endpoint to Blob storage.

    I have applied the pagination rule with a dynamic value like this:

    AbsoluteUrl = @replace('$.nextLink','oldbaseurl','newbaseurl')

    But it still accesses the oldbaseurl that I get in the response; the function doesn't replace it with the new one.

    Error in Data Factory:

    ErrorCode=UserErrorFailToReadFromRestResource,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=An error occurred while sending the request.,Source=Microsoft.DataTransfer.ClientLibrary,''Type=System.Net.Http.HttpRequestException,Message=An error occurred while sending the request.,Source=mscorlib,''Type=System.Net.WebException,Message=Unable to connect to the remote server,Source=System,''Type=System.Net.Sockets.SocketException,Message=A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has…

    18 votes  ·  0 comments
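    The intent of the @replace expression above is a plain string substitution on the nextLink returned by the API, e.g. in Python (the URLs here are made up for illustration):

```python
# Plain-Python equivalent of what the pagination expression intends:
# swap the base URL returned by the API for the one ADF should call.

def rewrite_next_link(next_link, old_base, new_base):
    """Replace the old base URL in the API's nextLink with the new one."""
    return next_link.replace(old_base, new_base)

link = "https://oldbaseurl.example.com/api/items?page=2"
rewritten = rewrite_next_link(link,
                              "https://oldbaseurl.example.com",
                              "https://newbaseurl.example.com")
# rewritten == "https://newbaseurl.example.com/api/items?page=2"
```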
  20. A role prohibited from previewing data

    Please consider adding a role like the one below. The user is:
    - allowed to change pipelines, datasets, linked services, and other ADF objects
    - allowed to retrieve schema information from a data source like SQL DB
    - prohibited from retrieving data from the data source with [Preview data] or other methods

    An easy workaround is to use Managed Identity auth or other non-key-based auth and delete the SQL permission during the operation.
    But it would be better to grant such permissions in bulk to the operators.

    29 votes  ·  0 comments