Azure Databricks

Azure Databricks is an Apache Spark-based analytics platform optimized for the Microsoft Azure cloud services platform. Designed with the founders of Apache Spark, Databricks is integrated with Azure to provide one-click setup, streamlined workflows, and an interactive workspace that enables collaboration between data scientists, data engineers, and business analysts.

We would love to hear any feedback you have for Azure Databricks.
For more details about Azure Databricks, try our documentation page.

  1. Databricks service principal per workspace for specific KeyVault access

    Databricks currently accesses KeyVault from the control plane and uses the same AzureDatabricks service principal for ALL Databricks workspaces in the tenant.

    At present, if you create a secret scope in workspace A on KeyVault A and a new secret scope in workspace B on KeyVault B, the Azure Databricks service principal has access to both KeyVaults. Therefore, provided you are privileged enough to know the details (resource URI) of the KeyVaults, you can create a scope from your own Databricks workspace C and get access to all the keys!

    It should be possible to specify an…

    18 votes
    0 comments  ·  Strong Feedback
  2. Managed Identity (MSI) for Databricks

    It should be possible to associate a managed identity with Databricks so it can interact with other Azure resources (e.g. KeyVault).

    78 votes
    6 comments
  3. Show the rowcount for rows returned in Notebooks

    The notebook shows the data returned but not the number of rows; to get the count I have to rerun the command with a select count(*) on it.

    This should be easy to implement.

    6 votes
    0 comments
  4. Enable Azure AD credential passthrough to ADLS Gen2

    Add the ability to pass the AAD credentials of the user working with an Azure Databricks cluster to Azure Data Lake Storage Gen2 filesystems, so that secure, enterprise data lake analytics can be built on top of ADLS Gen2 with Databricks. This feature should not be limited to high concurrency clusters, since these clusters do not support many features (including Scala), and because a typical advanced analytics scenario for the enterprise is to run a dedicated cluster for a small group of departmental analysts (Standard clusters are the most popular).

    77 votes
    4 comments
  5. Azure Databricks should have more granular level access permissions

    Currently, Azure Databricks Workspace provides only 4 options for access permissions:

    1. Workspace Access Control
    2. Cluster and Jobs Access Control
    3. Table Access Control
    4. Personal Access Tokens

    These permissions give users more access than they require.

    Would it be possible to create more permissions under Access Control, specifically for the requirements below?

    Access to view data sources
    Access to view Databricks runs to check failures and their reasons
    Access to view data changes and deployment issues
    Access to troubleshoot data processing failures caused by data issues or system errors in the Databricks workspace

    17 votes
    0 comments  ·  Strong Feedback
  6. Azure Databricks should be FedRAMP compliant across all US regions

    Azure Databricks seems to have every compliance certification other than FedRAMP. I would think it would not be an issue to become FedRAMP compliant, which would allow government data to be transformed in the platform.

    4 votes
    0 comments
  7. Azure diagnostic logs are delivered with up to a 24-hour delay, so alerts cannot be used

    As the doc says:
    "On any given day, Azure Databricks delivers at least 99% of diagnostic logs within the first 24 hours, and the remaining 1% in no more than 72 hours."
    Refer: https://docs.microsoft.com/en-us/azure/databricks/administration-guide/account-settings/azure-diagnostic-logs#diagnostic-log-delivery

    In this case, if logs are sent to Log Analytics, log search alerts cannot be used to monitor those logs because of the unpredictable delay. This has been raised by multiple customers; hopefully it can be improved.

    4 votes
    0 comments  ·  Strong Feedback
  8. Expose API key during ARM deployment

    We have a CI/CD pipeline set up for our analytics platform deployment (Datalake, ADF, Databricks, ...). Some of the settings are written directly to a KeyVault so they can be referenced later on by e.g. ADF Linked Services.
    However, for Azure Databricks there is always a manual step: a user needs to log in and create an API key in the UI.
    It would be great if the ARM template could return a temporary Databricks API key (e.g. valid only for 24h), which would allow us to automate everything (e.g. content deployment, cluster creation, ...) via the Databricks API.

    158 votes
    14 comments
  9. High availability for driver nodes

    Currently Azure Databricks clusters contain a single driver node, which creates a single point of failure should a process fail. This can cause clusters to become unresponsive during jobs, which greatly affects streaming jobs.

    I would propose that a second driver node be made available (when desired) to support automatic failover (HA) should a driver become unresponsive.

    7 votes
    1 comment
  10. Implement access token auto refresh when using credential passthrough

    When a cluster is configured with credential passthrough, we get an access denied error after 1 hour of running a notebook because the AD access token expires. It would therefore be nice to have an access token auto refresh feature, so that an Azure Active Directory admin does not need to increase the AccessTokenLifetime for users.

    This feature is also cited in a comment here: https://feedback.azure.com/forums/909463-azure-databricks/suggestions/36879865-enable-azure-ad-credential-passthrough-to-adls-gen

    2 votes
    0 comments  ·  Strong Feedback
  11. Please add support for Table access control when using "R"

    Currently, we can use Table access control with Python and SQL only.
    Please add a feature that lets us use Table access control when using "R".

    4 votes
    0 comments
  12. Timestamp for each command

    It would be very helpful to see the exact timestamps for when a command started and finished processing, not only the runtime in msec.

    2 votes
    0 comments  ·  Strong Feedback
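
Until something like this exists in the notebook UI, the timestamps requested in idea 12 can be captured by hand. The following is only a minimal sketch of such a workaround, not a Databricks feature: the `timed` helper and its `record` dictionary are hypothetical names introduced here, and the summed range stands in for a real notebook command.

```python
import time
from contextlib import contextmanager
from datetime import datetime, timezone


@contextmanager
def timed(label, log=print):
    """Record wall-clock start/finish timestamps and elapsed time for a block."""
    record = {"label": label, "started": datetime.now(timezone.utc)}
    t0 = time.perf_counter()
    try:
        yield record
    finally:
        # Filled in even if the wrapped block raises.
        record["finished"] = datetime.now(timezone.utc)
        record["elapsed_ms"] = (time.perf_counter() - t0) * 1000
        log(
            f"{label}: started {record['started'].isoformat()}, "
            f"finished {record['finished'].isoformat()}, "
            f"{record['elapsed_ms']:.0f} ms"
        )


# Stand-in for a real notebook command (hypothetical workload).
with timed("example command") as record:
    total = sum(range(1_000_000))
```

Wrapping each cell body this way prints both timestamps next to the runtime, at the cost of editing every cell by hand, which is why a built-in display is being requested.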
  13. Launching Databricks Workspace from Azure Portal

    In order to launch the Databricks workspace, the user needs to be an owner/contributor at the Databricks resource level in the Azure portal, which is annoying for enterprise users who are planning to roll out to larger audiences.

    Manually providing the direct workspace backend URL to each end user is not ideal, since there are a few workspaces now and there will be hundreds in the future.

    Permissions are set at the workspace and cluster level. When a user launches the workspace from the Azure portal, whatever API is calling Databricks should validate the existing…

    9 votes
    1 comment  ·  Strong Feedback
  14. Allow categorizing Azure Databricks costs by cluster name

    Allow us to see how much each cluster is spending, so we can better manage the costs related to certain activities.

    3 votes
    1 comment
  15. Tags should be inherited from the deployment resource group

    We've been trying to use tags in our environment to identify resources for billing purposes. We've put a policy in place that forces new resources to inherit this tag from their resource group if they weren't given one explicitly. Most Azure resources are handled fine, but we've discovered that the tags don't get inherited by the resources Databricks creates in the managed resource group. Instead, we have to assign the tags individually to each Databricks cluster. I'd like new clusters to inherit the tags of either the Managed Resource Group or the Databricks service itself.

    1 vote
    0 comments
  16. Job creation automation

    Is there any way to create an ADB job in an automated way? Currently I have to create jobs manually in all the environments.

    1 vote
    0 comments
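
For idea 16, one possible route is the Databricks Jobs REST API, whose `POST /api/2.0/jobs/create` endpoint accepts a JSON job definition. The sketch below only builds such a request and does not send it. The workspace URL, token placeholder, job name, notebook path, and cluster ID are all hypothetical values, and `build_create_job_request` is a helper name introduced here for illustration.

```python
import json
import urllib.request


def build_create_job_request(workspace_url, token, job_name, notebook_path, cluster_id):
    """Build (but do not send) a Jobs API 'create' request for a notebook job."""
    payload = {
        "name": job_name,
        "existing_cluster_id": cluster_id,
        "notebook_task": {"notebook_path": notebook_path},
    }
    return urllib.request.Request(
        url=f"{workspace_url.rstrip('/')}/api/2.0/jobs/create",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical values for illustration only.
request = build_create_job_request(
    workspace_url="https://adb-example.azuredatabricks.net",
    token="<personal-access-token>",
    job_name="nightly-etl",
    notebook_path="/Shared/etl/nightly",
    cluster_id="<cluster-id>",
)
# To submit against a real workspace: urllib.request.urlopen(request)
```

Running the same script once per environment, with a different workspace URL and token each time, would replace the manual job setup described above, assuming a token is available for each workspace.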
  17. STOP the nonsense of making Resource Groups for these services if you really want us to use them!! Completely annoying.

    Totally insane. Databricks is the WORST offender of this, but Network Watcher does it as well. I won't allow RGs to be created unless they are NAMED and TAGGED according to OUR rules, so people cannot use this service. Period.

    13 votes
    2 comments  ·  Strong Feedback

    Thanks for the valid suggestion. Your feedback is now open for the user community to upvote & comment on. This allows us to effectively prioritize your request against our existing feature backlog and also gives us insight into the potential impact of implementing the suggested feature.

  18. Can we access Event Hubs in Databricks using a service principal?

    The Event Hubs documentation shows that it can be accessed using a service principal, and I also see that the Python libraries allow access to Event Hubs using a service principal.

    Can we access Event Hubs in Databricks using a service principal instead of SAS?

    1 vote
    0 comments
  19. Security level when connecting Databricks to other Azure services

    Hello,

    The [following page](https://docs.microsoft.com/fr-fr/azure/databricks/administration-guide/cloud-configurations/azure/vnet-inject) states:

    "VNet injection, enabling you to:
    Connect Azure Databricks to other Azure services (such as Azure Storage) in a more secure manner using service endpoints."

    Data Factory is now a 'Trusted Service' in the Azure Storage and Azure Key Vault firewalls, so we can connect to those services as a 'Trusted Service' using the Data Factory managed identity and the firewall setting 'Allow trusted Microsoft Services…'.

    Could you please explain why using service endpoints is "more secure" than using 'Trusted Service'?

    Reference : [Data Factory is now a 'Trusted Service' in Azure Storage and Azure Key…

    1 vote
    0 comments
  20. Enhance Databricks for CI/CD with ARM support for obtaining a Databricks PAT from an AAD identity and linking KeyVault as a Databricks secret scope

    ARM templates only create the Databricks workspace. Adding support for AAD identities obtaining a Databricks PAT and linking to KeyVault would really help with cluster deployment.

    At Build there was an announcement that scripts would soon be included in ARM templates, so updating the Databricks API to support these actions would probably allow this.

    10 votes
    0 comments