Azure Databricks

Azure Databricks is an Apache Spark-based analytics platform optimized for the Microsoft Azure cloud. Designed in collaboration with the founders of Apache Spark, it is integrated with Azure to provide one-click setup, streamlined workflows, and an interactive workspace that enables collaboration between data scientists, data engineers, and business analysts.

We would love to hear any feedback you have for Azure Databricks.
For more details about Azure Databricks, see our documentation page.

  1. Let a notebook attach and run on a terminated cluster while another cluster is running

    I connect my notebook to a cluster that is currently running and run some commands. Then I realize that I connected to the wrong cluster, so I attach the notebook to another cluster that is currently shut down and run the command again. I expect the shut-down cluster to auto-start and run my command, but instead I am told that another cluster is currently running while the newly attached cluster is shut down, and I am asked whether the notebook should attach to the active cluster and run the command there. The two options for this question are "Cancel"…

    1 vote  ·  0 comments
  2. Multiple Cron schedules for one job

    Some simple schedules cannot be configured for a single job in Databricks due to the limitations of cron expressions, e.g. running a job every 40 minutes. Multiple schedules could provide such a frequency, but today that would require keeping duplicate copies of the same job with different schedules.
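To illustrate the workaround being described: a 40-minute cadence repeats on a two-hour cycle, so two cron schedules together can cover it — one firing at minutes 0 and 40 of even hours, the other at minute 20 of odd hours (in Quartz syntax these might be written "0 0,40 0-22/2 * * ?" and "0 20 1-23/2 * * ?", shown here only as an illustration). A small Python sketch checking the arithmetic:

```python
# Schedule A: minutes 0 and 40 of even hours.
# Schedule B: minute 20 of odd hours.
# Together they should fire exactly every 40 minutes.
fires = sorted(
    h * 60 + m
    for h in range(24)
    for m in ((0, 40) if h % 2 == 0 else (20,))
)
gaps = {b - a for a, b in zip(fires, fires[1:])}
print(sorted(gaps))  # [40]
```

This confirms the union of the two schedules has a constant 40-minute gap — but it still means maintaining two copies of the job today, which is exactly the pain point above.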

    1 vote  ·  0 comments  ·  Strong Feedback
  3. Workspace, cluster, job ownership changes

    It is an unfortunate situation that there is no way to change the ownership of:
    * a workspace
    * a cluster
    * a job
    When people leave a company, all of the above remain owned by them. I understand that some of the issues around individuals owning resources can be avoided with a service principal, but that is only a solution for new deployments and can still be inconvenient in some use cases. There should be an option to change ownership. The lack of this feature can also cause operational issues for clusters and jobs:
    * a cluster whose owner is removed from the workspace will not be…

    2 votes  ·  0 comments
  4. Cluster status API - check if cluster is up and running

    Currently there is no way to use the API to verify whether a given cluster is running. This causes a lot of trouble for tools like Power BI that need to verify upfront whether queries can run. This information is available in the GUI, so we would gladly welcome the same functionality in the REST API.
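For comparison, the Clusters REST API's get endpoint does return a lifecycle state field for a cluster; a minimal sketch of polling it is below (the host, token, and cluster ID are placeholders, and the parsing helper is separated out so it can be checked without a live workspace):

```python
import json
import urllib.request

def state_is_running(payload: dict) -> bool:
    # The cluster info payload carries a "state" field such as
    # PENDING, RUNNING, TERMINATING or TERMINATED.
    return payload.get("state") == "RUNNING"

def cluster_is_running(host: str, token: str, cluster_id: str) -> bool:
    # host would be something like "https://adb-<id>.<n>.azuredatabricks.net"
    req = urllib.request.Request(
        f"{host}/api/2.0/clusters/get?cluster_id={cluster_id}",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return state_is_running(json.load(resp))

print(state_is_running({"state": "RUNNING"}))     # True
print(state_is_running({"state": "TERMINATED"}))  # False
```

A client like Power BI, of course, cannot easily make such a pre-flight call itself, which is the gap this idea describes.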

    2 votes  ·  0 comments
  5. Add transport of all Cluster events to Log Analytics workspace

    Currently the diagnostic settings for a workspace allow only a limited number of events to be sent to the Log Analytics DatabricksClusters category. Events that are missing include:

    • cluster termination

    • cluster auto-scaling events (adding machines, expanding disks, and the reverse)

    In general, it would be more than welcome to have all of the information available in the cluster event log made available in Log Analytics as well.

    2 votes  ·  0 comments  ·  Strong Feedback
  6. Moving multiple cells up/down together

    In a Jupyter notebook, you can select multiple cells and easily move them up or down together. This is not currently possible in Databricks.

    1 vote  ·  0 comments
  7. Enable Azure AD credential passthrough to ADLS Gen2 from PowerBI

    At present, the Power BI connector uses token authentication. It would be ideal if it used Azure AD authentication and that identity were passed down to the underlying source (Data Lake Gen2).

    Credential passthrough is currently only available within the workspace using High Concurrency clusters, but we would like non-technical users to use Power BI.

    1 vote  ·  0 comments  ·  Strong Feedback
  8. Unable to install azure-eventhub PYPI package

    Hi,

    I tried to install the azure-eventhub PyPI package on a Databricks cluster but got this error:

    Could not find a version that satisfies the requirement azure-eventhub (from versions: )
    No matching distribution found for azure-eventhub

    Does Databricks restrict installing libraries from PyPI? I can see the package on PyPI, so why can't Databricks find it?

    1 vote  ·  0 comments
  9. Add diagnostic logging configuration to ARM template

    It is possible to send control-plane diagnostic logs from Azure Databricks to Log Analytics, but it is not possible to automate this configuration: it must be done through the Portal. It would be great if this could be included in the ARM template or done via PowerShell.

    1 vote  ·  0 comments
  10. 2 votes  ·  0 comments  ·  Strong Feedback
  11. Support changing the VNet Address space or subnet CIDR of an existing Azure Databricks workspace

    Modifying the CIDR is quite common, especially when a proof of concept (POC) succeeds and you want to go further and connect it to a corporate network.

    RFC 1918 addresses are a real challenge to maintain, and when you run a POC you cannot quickly obtain a /16 or /24, as the Databricks VNet injection feature requires.

    For more information, see the URL below, which says this is not supported; the impact I saw was that Spark commands no longer worked (dbutils did).
    https://docs.microsoft.com/en-us/azure/databricks/kb/cloud/azure-vnet-jobs-not-progressing
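To put the prefix sizes mentioned above in perspective, a quick sketch with Python's standard ipaddress module (the 10.0.0.0 networks are arbitrary illustrations):

```python
import ipaddress

# A /16 reserves 65,536 addresses, a /24 only 256 -- which is why
# carving a /16 out of shared RFC 1918 space just for a POC is hard.
big = ipaddress.ip_network("10.0.0.0/16")
small = ipaddress.ip_network("10.0.0.0/24")
print(big.num_addresses, small.num_addresses)  # 65536 256
```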

    4 votes  ·  0 comments  ·  Strong Feedback
  12. runtime versions

    Databricks needs to better test the compatibility between different runtime versions and the various packages.

    1 vote  ·  0 comments  ·  Strong Feedback
  13. Support importing workspace libraries in the REST API

    Workspaces can contain notebooks, folders, and libraries. However, the REST API only supports importing notebooks and DBC archives into a workspace. This feature request is to support importing libraries (e.g. .whl files) into a workspace as well.

    1 vote  ·  0 comments
  14. Support Single-Sign On with custom identity providers

    Databricks on AWS already supports multiple identity providers for SSO; see https://docs.databricks.com/administration-guide/users-groups/single-sign-on/index.html.

    There is no reason why Azure Databricks should be limited to AAD for SSO.

    5 votes  ·  0 comments
  15. backquotes do not work as documented for SQL

    This documentation example, from https://docs.microsoft.com/en-us/azure/databricks/getting-started/spark/dataframes#run-sql-queries:

    select `State Code`, `2015 median sales price` from data_geo

    Fails on

    SELECT `CpuUtilization Average` FROM t09c0e51a

    as does this:

    as_parquet = as_parquet.withColumnRenamed("CpuUtilization Average", "CpuUtilizationAverage")
    as_parquet.createOrReplaceTempView('t09c0e51a')
    sqlContext.sql("SELECT CpuUtilizationAverage FROM t09c0e51a").take(1)


    org.apache.spark.sql.AnalysisException: Attribute name "CpuUtilization Average" contains invalid character(s) among " ,;{}()\n\t=". Please use alias to rename it.;
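For context on the quoting rule at issue: Spark SQL uses backticks to quote identifiers that contain spaces. A minimal sketch of the same quoting style, using SQLite here only because it also accepts backticks and runs anywhere:

```python
import sqlite3

# Backticks let an identifier contain spaces, as in the Spark SQL docs.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t09c0e51a (`CpuUtilization Average` REAL)")
con.execute("INSERT INTO t09c0e51a VALUES (0.75)")
row = con.execute(
    "SELECT `CpuUtilization Average` FROM t09c0e51a"
).fetchone()
print(row)  # (0.75,)
```

Note that the AnalysisException quoted above appears to come from Parquet's column-name restrictions (no spaces allowed), which apply regardless of how the SQL identifier is quoted; renaming the column, as attempted, is the usual fix.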

    1 vote  ·  0 comments
  16. Tags should be inherited from the deployment resource group

    We've been trying to use tags in our environment to identify resources for billing purposes. We've put a policy in place that forces new resources to inherit this tag from their resource group if it wasn't given explicitly. Most Azure resources are handled fine, but we've discovered that the tags don't get inherited by the resources Databricks creates in the managed resource group. Instead, we have to assign the tags to each Databricks cluster individually. I'd like new clusters to inherit the tags of either the managed resource group or the Databricks service itself.

    1 vote  ·  0 comments
  17. job creation automation

    Is there any way to create an Azure Databricks job in an automated way? Currently I have to create jobs manually in every environment.
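Jobs can typically be created programmatically through the Jobs REST API, which a deployment pipeline can call once per environment. A hedged sketch against the Jobs API 2.0 create endpoint follows; the host, token, cluster ID, job name, and notebook path are all placeholders:

```python
import json
import urllib.request

def job_payload(name: str, notebook_path: str, cluster_id: str) -> dict:
    # Minimal Jobs API 2.0 payload: run an existing notebook on an
    # existing cluster (many optional fields omitted).
    return {
        "name": name,
        "existing_cluster_id": cluster_id,
        "notebook_task": {"notebook_path": notebook_path},
    }

def create_job(host: str, token: str, payload: dict) -> int:
    req = urllib.request.Request(
        f"{host}/api/2.0/jobs/create",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["job_id"]

payload = job_payload("nightly-etl", "/Shared/etl", "1234-567890-abcde123")
print(payload["name"])  # nightly-etl
```

The same payload could be checked into source control and replayed per environment, which avoids the manual job creation described above.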

    1 vote  ·  0 comments
  18. Show the rowcount for rows returned in Notebooks

    The notebook shows the data returned but not the number of rows; without it I have to rerun the command with a select count(*) on it.

    This should be easy to implement.

    6 votes  ·  0 comments
  19. Azure Databricks should be FedRAMP compliant across all US regions

    Azure Databricks seems to have every compliance certification other than FedRAMP. I would think it would not be an issue to become FedRAMP compliant, which would allow government data to be processed on the platform.

    4 votes  ·  0 comments
  20. Can we access Event Hubs in Databricks using a service principal?

    The Event Hubs documentation shows that it can be accessed using a service principal, and I can see that the Python libraries support this as well.

    Can we access Event Hubs from Databricks using a service principal instead of SAS?

    1 vote  ·  0 comments
