Azure Databricks

Azure Databricks is an Apache Spark-based analytics platform optimized for the Microsoft Azure cloud services platform. Designed with the founders of Apache Spark, Databricks is integrated with Azure to provide one-click setup, streamlined workflows, and an interactive workspace that enables collaboration between data scientists, data engineers, and business analysts.

We would love to hear any feedback you have for Azure Databricks.
For more details about Azure Databricks, try our documentation page.

  1. Jobs/Interactive Tasks Prioritization

    By default, a new task consumes all of the available resources in Databricks, and other tasks may have to wait. Since this is a shared big-data environment, it would often be beneficial to assign priorities at the job level or at the task level (when working interactively in a notebook). Databricks could then be instructed that when a higher-priority task arrives, running tasks must yield some of their allocated resources and grant the required memory/CPU to the higher-priority task. If many users are working…
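    The requested behavior — higher-priority work reclaiming part of the resources held by lower-priority work — can be sketched as a toy scheduler. This is purely illustrative of the requested semantics; the class, task names, and core counts are hypothetical, not a Databricks API:

    ```python
    # Toy illustration of priority-based resource yielding (hypothetical, not a Databricks API).
    class Pool:
        def __init__(self, total_cores):
            self.total = total_cores
            self.running = {}  # task name -> (priority, cores held)

        def submit(self, name, priority, cores_wanted):
            free = self.total - sum(c for _, c in self.running.values())
            # Not enough free cores: lower-priority tasks yield cores, lowest priority first.
            for other, (p, held) in sorted(self.running.items(), key=lambda kv: kv[1][0]):
                if free >= cores_wanted:
                    break
                if p < priority and held > 1:
                    yielded = min(held - 1, cores_wanted - free)
                    self.running[other] = (p, held - yielded)
                    free += yielded
            granted = min(cores_wanted, free)
            self.running[name] = (priority, granted)
            return granted

    pool = Pool(total_cores=8)
    pool.submit("batch_etl", priority=1, cores_wanted=8)            # takes everything
    granted = pool.submit("interactive", priority=10, cores_wanted=4)  # batch job yields 4 cores
    ```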

    8 votes · 0 comments
  2. Databricks Service principal per workspace for specific KeyVault access

    Databricks currently accesses Key Vault from the control plane and uses the same AzureDatabricks service principal for ALL Databricks workspaces in the tenant.

    At present, if you create a secret scope in workspace A on Key Vault A and another secret scope in workspace B on Key Vault B, the AzureDatabricks service principal has access to both Key Vaults. Therefore, provided you are privileged enough to know the details (resource URI) of the Key Vaults, you can create a scope from your own Databricks workspace C and gain access to all the keys!

    It should be possible to specify an…

    26 votes · 2 comments · Strong Feedback
  3. Log automated deletion of unpinned clusters

    If unpinned clusters are deleted by Databricks, there are no logs to be found anywhere in Log Analytics. For auditing and monitoring purposes, I think this is needed.

    Sometimes people forget to pin clusters, and then you at least want to know from the logs what happened to the cluster and when it was deleted.

    6 votes · 0 comments
  4. Implement access token auto refresh when using credential passthrough

    When a cluster is configured with credential passthrough, we get an access-denied error after a notebook has been running for one hour, because the AD access token expires. It would therefore be nice to have an access-token auto-refresh feature, with no need for an Azure Active Directory admin to increase the AccessTokenLifetime for users.

    This feature is also cited in a comment here: https://feedback.azure.com/forums/909463-azure-databricks/suggestions/36879865-enable-azure-ad-credential-passthrough-to-adls-gen
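    Until such a feature exists, long-running drivers typically work around expiry by refreshing the token themselves shortly before it lapses. A minimal sketch of that pattern — the `acquire` callback stands in for whatever issues your AAD token (e.g. an MSAL call); the class and timings here are illustrative, not a Databricks API:

    ```python
    import time

    class TokenCache:
        """Refresh an access token before it expires (generic sketch, not a Databricks API)."""
        def __init__(self, acquire, skew_seconds=300):
            self.acquire = acquire      # callable returning (token, lifetime_seconds)
            self.skew = skew_seconds    # refresh this long before actual expiry
            self.token, self.expires_at = None, 0.0

        def get(self, now=None):
            now = time.time() if now is None else now
            if now >= self.expires_at - self.skew:
                token, lifetime = self.acquire()
                self.token, self.expires_at = token, now + lifetime
            return self.token

    calls = []
    def fake_acquire():
        # Stand-in for a real AAD token request; pretend tokens live one hour.
        calls.append(1)
        return f"token-{len(calls)}", 3600

    cache = TokenCache(fake_acquire)
    t0 = 1_000_000.0
    first = cache.get(now=t0)         # initial acquisition
    same = cache.get(now=t0 + 1800)   # 30 min later: token still valid, no refresh
    fresh = cache.get(now=t0 + 3400)  # within the 5-min skew of expiry: refreshed
    ```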

    11 votes · 1 comment · Strong Feedback
  5. Azure Diagnostic logs are collected with up to a 24-hour delay, so alerts cannot be used

    As the docs say:
    On any given day, Azure Databricks delivers at least 99% of diagnostic logs within the first 24 hours, and the remaining 1% in no more than 72 hours.
    Refer: https://docs.microsoft.com/en-us/azure/databricks/administration-guide/account-settings/azure-diagnostic-logs#diagnostic-log-delivery

    In this case, if logs are sent to Log Analytics, log-search alerts cannot be used to monitor those logs because of the unpredictable delay. This has been reported by multiple customers; we hope this can be enhanced.
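    A worked example of why a fixed look-back alert misses delayed records: an alert that scans the last hour of event time never sees a record ingested three hours after it occurred. All timestamps below are made up for illustration:

    ```python
    from datetime import datetime, timedelta

    # (event_time, ingestion_time) pairs; the second record is ingested ~3 hours late.
    records = [
        (datetime(2020, 6, 1, 10, 0),  datetime(2020, 6, 1, 10, 5)),
        (datetime(2020, 6, 1, 10, 30), datetime(2020, 6, 1, 13, 30)),
    ]

    def seen_by_alert(run_time, lookback=timedelta(hours=1)):
        """An alert rule filtering on event time over a fixed look-back window."""
        return [e for e, i in records
                if run_time - lookback <= e <= run_time and i <= run_time]

    # Alert runs at 11:00: only the promptly ingested record is visible.
    visible = seen_by_alert(datetime(2020, 6, 1, 11, 0))
    # By 14:00 the delayed record has arrived, but its event time is
    # already outside the one-hour window, so it is never alerted on.
    late = seen_by_alert(datetime(2020, 6, 1, 14, 0))
    ```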

    11 votes · 2 comments · Strong Feedback
  6. 8 votes · 0 comments · Strong Feedback
  7. Diff over multiple versions in the "Revision history" and/or export diffs

    The "Revision history" view is very unhandy when searching for changes:
    a) The view renders very slowly, even worse for bigger notebooks.
    b) Diffs have to be found by scrolling through the whole notebook looking for red and green areas.
    c) Worst of all, I can only compare a revision to the one immediately before it.
    There is an urgent need for faster comparison; even a simple export of the revisions in text format/Unix diff would greatly increase the value of this basically nice front-end feature. We are all used to classic diff-tool options like scrollbar marking, jumping to the next change, and mainly…
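    As a workaround today, revisions can be exported as notebook source (e.g. via the Workspace export API) and compared locally with a standard unified diff. A sketch using Python's difflib — the two revision strings below are dummy content standing in for real exports:

    ```python
    import difflib

    # Two exported notebook sources (dummy content standing in for real revisions).
    rev_a = "df = spark.read.csv(path)\ndf.show()\n"
    rev_b = "df = spark.read.csv(path)\ndf.cache()\ndf.show()\n"

    diff = list(difflib.unified_diff(
        rev_a.splitlines(keepends=True),
        rev_b.splitlines(keepends=True),
        fromfile="revision_a", tofile="revision_b",
    ))
    print("".join(diff))
    ```

    This gives exactly the classic diff-tool experience the request asks for: added lines prefixed `+`, removed lines prefixed `-`, with hunk headers to jump between changes.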

    4 votes · 0 comments
  8. NAT Gateway Compatibility

    Make Databricks workspaces compatible with NAT Gateways. Currently, when you associate a NAT Gateway with the public subnet of the Databricks workspace, clusters will not start and raise the following error:

    Azure error code: AzureVnetConfigurationFailure(SubnetWithNatGatewayAndBasicSkuResourceNotAllowed)
    Azure error message: Encountered error while attempting to create NIC within injected virtual network. Details:
    NAT Gateway cannot be deployed on subnet containing Basic SKU Public IP addresses or Basic SKU Load Balancer.

    3 votes · 2 comments
  9. Create PaaS-managed Databricks Service Tags for use in NSGs instead of VirtualNetwork

    The default, highest-priority NSG rule created by Databricks in PaaS-managed NSGs is "permit any protocol from VirtualNetwork to VirtualNetwork". The rule is enforced with a network intent policy and cannot be overridden. In cases where UDRs with default routes (destination 0.0.0.0/0) are attached to the Databricks subnets, or the VNets learn a default route from a VNet gateway, the NSG effectively becomes a "permit any any" rule. Databricks nodes have public IP addresses, which creates an unreasonable attack surface when combined with a wide-open NSG. Having a PaaS-managed service tag to permit required internal network access without…

    2 votes · 0 comments
  10. Databricks-Connect with multi configuration on same machine

    Does the databricks-connect library (https://docs.microsoft.com/en-us/azure/databricks/dev-tools/databricks-connect) have an option to support multiple configurations? Our scenario requires switching from one configuration to another when necessary. The current docs seem to indicate that it only supports one global settings file; we tested with Anaconda environments, and the settings configured under one environment were picked up by another.
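    One workaround is to drive databricks-connect through environment variables instead of the single global config file — the client documents `DATABRICKS_ADDRESS`, `DATABRICKS_API_TOKEN`, `DATABRICKS_CLUSTER_ID`, `DATABRICKS_ORG_ID`, and `DATABRICKS_PORT` — so each shell or conda environment can carry its own settings. A sketch of per-profile switching; the workspace URLs and cluster IDs below are placeholders:

    ```python
    import os

    # Hypothetical per-project settings; real values would come from your workspaces.
    PROFILES = {
        "dev":  {"DATABRICKS_ADDRESS": "https://adb-111.azuredatabricks.net",
                 "DATABRICKS_CLUSTER_ID": "0123-456789-dev"},
        "prod": {"DATABRICKS_ADDRESS": "https://adb-222.azuredatabricks.net",
                 "DATABRICKS_CLUSTER_ID": "0123-456789-prod"},
    }

    def activate(profile):
        """Export one profile's settings before creating a SparkSession."""
        os.environ.update(PROFILES[profile])

    activate("dev")
    # ...build a SparkSession here and run against the dev workspace...
    activate("prod")
    # ...subsequent sessions now target the prod workspace...
    ```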

    1 vote · 1 comment
  11. Modify the /secrets/scopes/create API so we can add Key Vault backed scopes

    Currently, we can only add a Databricks-backed secret scope to the workspace using the REST API.

    I want to deploy workspaces programmatically, so I would like to be able to deploy a Key Vault-backed secret scope through the API.
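    For concreteness, a request body along these lines is what the idea envisions for `POST /api/2.0/secrets/scopes/create` — the `scope_backend_type` and `backend_azure_keyvault` fields sketch the requested extension, and the resource values are placeholders:

    ```python
    import json

    # Hypothetical request body for creating a Key Vault-backed secret scope
    # (the Key Vault fields sketch the requested extension; IDs are placeholders).
    payload = {
        "scope": "my-kv-scope",
        "scope_backend_type": "AZURE_KEYVAULT",
        "backend_azure_keyvault": {
            "resource_id": ("/subscriptions/<sub-id>/resourceGroups/<rg>"
                            "/providers/Microsoft.KeyVault/vaults/<vault-name>"),
            "dns_name": "https://<vault-name>.vault.azure.net/",
        },
    }
    body = json.dumps(payload)  # would be POSTed to /api/2.0/secrets/scopes/create
    ```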

    2 votes · 1 comment
  12. Support changing the VNet Address space or subnet CIDR of an existing Azure Databricks workspace

    Modifying the CIDR is quite common, especially when a proof of concept (POC) succeeds and you want to go further and connect it to a corporate network.

    RFC 1918 addresses are a real challenge to maintain, and when you run a POC you cannot quickly obtain a /16 or /24 as required by the Databricks virtual network injection feature.

    For more information: I had missed the page below saying this is not supported, and the impact I saw was that the Spark commands no longer worked (dbutils did).
    https://docs.microsoft.com/en-us/azure/databricks/kb/cloud/azure-vnet-jobs-not-progressing

    5 votes · 0 comments · Strong Feedback
  13. Access Azure Certificate Stored in Azure KeyVault inside Databricks

    What is the option for accessing a certificate stored in Azure Key Vault? We need a feature like dbutils.secrets for certificates.
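    A common workaround is to store the certificate as a PEM string in a Key Vault secret, read it with the existing `dbutils.secrets.get`, and write it to a local file for libraries that expect a certificate path. A sketch of the file-writing half — the PEM content below is a dummy; in a notebook the string would come from `dbutils.secrets.get`:

    ```python
    import os
    import tempfile

    def pem_to_file(pem_text):
        """Write PEM material from a secret to a temp file and return its path."""
        fd, path = tempfile.mkstemp(suffix=".pem")
        with os.fdopen(fd, "w") as f:
            f.write(pem_text)
        return path

    # Dummy stand-in for dbutils.secrets.get(scope="kv-scope", key="my-cert")
    dummy_pem = "-----BEGIN CERTIFICATE-----\nMIIB...\n-----END CERTIFICATE-----\n"
    cert_path = pem_to_file(dummy_pem)  # pass cert_path to any library expecting a file
    ```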

    3 votes · 0 comments · Strong Feedback
  14. Show the rowcount for rows returned in Notebooks

    The notebook shows the data returned but not the number of rows; otherwise I have to rerun the command with a SELECT COUNT(*) on it.

    This should be easy to implement.

    6 votes · 0 comments
  15. Support Single-Sign On with custom identity providers

    Databricks on AWS already supports multiple identity providers for SSO. Check https://docs.databricks.com/administration-guide/users-groups/single-sign-on/index.html.

    There is no reason why Azure Databricks should be limited only to AAD for SSO.

    5 votes · 0 comments
  16. Enable Azure AD credential passthrough to ADLS Gen2

    Add a feature for passing the AAD credential of the user working with an Azure Databricks cluster through to Azure Data Lake Storage Gen2 filesystems, to build secure, enterprise-grade data lake analytics on top of ADLS Gen2 with Databricks. This feature should not be limited to high-concurrency clusters, since those clusters do not support many features (including Scala), and because a typical advanced analytics scenario for the enterprise is to run a dedicated cluster for a small group of departmental analysts (standard clusters are the most popular).
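    For context, passthrough is switched on per cluster through a Spark configuration flag. A sketch of the relevant fragment of a Clusters API request body, assuming the documented `spark.databricks.passthrough.enabled` setting for single-user standard clusters; the cluster name and user are placeholders and other required fields are elided:

    ```python
    # Fragment of a Clusters API request body enabling AD credential passthrough
    # on a standard (single-user) cluster; other required fields are elided.
    cluster_spec = {
        "cluster_name": "passthrough-demo",         # placeholder name
        "spark_conf": {
            "spark.databricks.passthrough.enabled": "true",
        },
        "single_user_name": "analyst@example.com",  # placeholder user
    }
    ```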

    88 votes · 4 comments
  17. Add a lock button on notebooks for GitHub/DevOps management

    When users use Azure Data Factory to invoke a Databricks notebook activity, it uses the current code instead of a committed version of the notebook. It would be useful to add a feature to lock the current notebook and let Azure Data Factory use the committed version in GitHub or DevOps.

    3 votes · 0 comments · Strong Feedback
  18. Azure Databricks should have more granular level access permissions

    Currently, Azure Databricks workspaces provide only four options for access permissions:

    1. Workspace Access Control

    2. Cluster and Jobs Access Control

    3. Table Access Control

    4. Personal Access Tokens

    These permissions give the user more access than required.

    Would it be possible to create more granular permissions under Access Control?

    Specifically for the requirements below:

    Access to view data sources
    Access to view Databricks runs to check failures and their reasons
    Access to view data changes and deployment issues
    Access to troubleshoot data-processing failures caused by data issues or system errors in the Databricks workspace

    21 votes · 0 comments · Strong Feedback
  19. Access for internal/private maven repositories from Databricks

    We need support for pulling libraries from internal/private Maven repositories in Databricks.

    2 votes · 0 comments · Strong Feedback
  20. H2O Flow or H2O Flow Equivalent in Azure

    The H2O Flow GUI works with Databricks on AWS but not on Azure. It auto-generates a huge amount of interactive charts and metrics for post-model evaluation via a localhost URL. I would like to request that H2O Flow be enabled in Azure, and further that Databricks add much more interactive auto-generated content like H2O Flow does. This includes interactive ROC curves where you can traverse the confusion matrix by selecting any point on the curve, cross-validation data set scores, and variable importance.

    1 vote · 0 comments