Azure Synapse Analytics
-
JDBC connection to Spark Pool
Please support JDBC connection to Synapse Spark Pool.
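For illustration only, a sketch of what a JDBC connection from Python might look like if a Spark pool exposed a Thrift/JDBC endpoint the way Azure Databricks does; the driver, host, path and credentials below are assumptions, not a documented Synapse feature:
import jaydebeapi  # generic JDBC bridge for Python

# Hypothetical endpoint - Synapse Spark pools do not currently expose one
conn = jaydebeapi.connect(
    "com.simba.spark.jdbc.Driver",  # Simba Spark JDBC driver class used by Databricks
    "jdbc:spark://<workspace>.azuresynapse.net:443/default;transportMode=http;ssl=1",
    {"UID": "token", "PWD": "<access-token>"},
    "/path/to/SparkJDBC42.jar",
)
cursor = conn.cursor()
cursor.execute("SELECT * FROM my_table LIMIT 10")
print(cursor.fetchall())
conn.close()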
25 votes -
Add Support for delta lake v0.7 and spark 3.0 in spark pool
Delta Lake v0.6.1 doesn't support much of the ACID functionality. It would be great to upgrade the Delta Lake version and Spark version to utilize the functionality supported by Databricks.
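As a minimal sketch of what the upgrade would unlock (table and column names are made up for illustration), SQL DML such as MERGE INTO only works on Delta Lake 0.7+ running on Spark 3.0:
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # pre-created as `spark` in a Synapse notebook

# Upsert staged rows into a Delta table; this SQL form needs Delta Lake 0.7+ / Spark 3.0
spark.sql("""
    MERGE INTO events AS t
    USING updates AS u
    ON t.id = u.id
    WHEN MATCHED THEN UPDATE SET *
    WHEN NOT MATCHED THEN INSERT *
""")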
10 votes -
Need %pip installer within the workspace notebook, so we can install libraries as needed immediately
Need %pip installer within the workspace notebook, so we can install libraries as needed immediately
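For example, a cell like the following (the package name is just an illustration) would let a library be installed and used in the same session:
%pip install azure-storage-blob

# usable immediately in the same notebook session once the install completes
from azure.storage.blob import BlobServiceClient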
7 votes -
Make available some utilities in Spark Pools as already existing in Databricks
It would be good to have some Spark utilities like those already available in Databricks, such as executing one notebook from another notebook and passing it parameters.
Example:
dbutils.notebook.run('NotebookName', 3600, parameters)
This is very much needed to have a dynamic notebook which can trigger the execution of another notebook.
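A slightly fuller sketch of the Databricks pattern being referenced (notebook path and parameter values are illustrative):
result = dbutils.notebook.run(
    "/Shared/ChildNotebook",        # notebook to execute
    3600,                           # timeout in seconds
    {"run_date": "2021-01-01"},     # parameters passed to the child notebook
)
print(result)  # value the child notebook returns via dbutils.notebook.exit(...)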
7 votes -
Apache Spark - Allow Sharing a cluster across multiple users
Currently, an entirely new cluster is spun up for every user who starts an Apache Spark session using Notebooks. Please add the ability to share a single physical cluster across multiple users. Spinning up a new cluster (with a minimum of 3 nodes) for every user is very expensive.
6 votes -
Allow privately hosted pip repositories in Synapse
Allow privately hosted pip repositories in Synapse.
Private packages are commonly used, and currently the workaround of copying all the code is pretty obnoxious.
5 votes -
Support for Key Vault secret access from Spark/Python Notebook
My customer needs access to their secrets stored in Azure Key Vault from within a Spark notebook. These credentials are used to access their Cognitive Services keys in order to complete their data engineering process. The libraries installed in Synapse conflict with the documentation provided by Key Vault.
https://docs.microsoft.com/en-us/azure/key-vault/secrets/quick-create-python?tabs=cmd
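For reference, the access pattern from that quickstart as it would be written in a notebook cell (vault and secret names are examples):
from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient

credential = DefaultAzureCredential()
client = SecretClient(vault_url="https://my-vault.vault.azure.net", credential=credential)

# e.g. the Cognitive Services key used later in the data engineering pipeline
cognitive_key = client.get_secret("cognitive-services-key").value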
3 votes -
Connect Spark Pool to Power BI
Please support the ability to connect Spark pool to Power BI by exposing the cluster connection information for Spark pool.
Currently, Spark data needs to be loaded into a different data source such as Azure SQL Data Warehouse (ADW) before Power BI can use the data.
However, Power BI has the capability to connect to Spark (https://docs.microsoft.com/en-us/azure/databricks/integrations/bi/power-bi#step-2-get-azure-databricks-connection-information).
Adding ADW as an intermediate step from Spark to Power BI just adds unnecessary delay in syncing the data between Spark and ADW.
Potentially relevant discussion post: https://feedback.azure.com/forums/307516-azure-synapse-analytics/suggestions/40706374-jdbc-connection-to-spark-pool
3 votes -
Connecting remotely to Azure Spark Pool for notebook hosting.
We would like to host a notebook service on top of Azure Synapse Analytics, with a custom frontend.
This requires the possibility of connecting the notebook to a Python kernel running in a remote Azure Spark instance.
According to the answer provided here in the Microsoft forums, this is currently not available.
2 votes -
Better error messaging for Synapse Spark Pools
Today I'm getting an error message across multiple tenant workspaces (tested to make sure it wasn't a setup issue) when trying to start Spark clusters: "Failed to start cluster:". That doesn't really help identify what the issue might be. Last week I received the error "CLUSTER_IN_TERMINAL_STATE_BEFORE_READY". Neither of them is very useful, and both of them resolved on their own after waiting a while.
Better error reporting, maybe some documentation of common errors and resolution, and some stability on the MS backend would go a long way to increase adoption.
2 votes -
add support for wheel library distributions
Support wheel (.whl) library distributions; Azure Databricks currently supports these.
2 votes -
Setting Custom Credentials for ADLS/ABS
I would like to be able to use custom credentials when authenticating to ADLS or ABS when I'm writing Spark DataFrames out to those locations. This is a feature in Databricks as shown here (https://docs.databricks.com/data/data-sources/azure/azure-datalake.html). Right now, when I write out Spark DataFrames in Azure Synapse with Spark on Cosmos, it defaults to using my user credentials, but there are scenarios where I would want to use a service principal. This feature would be for those scenarios.
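For reference, the Databricks-documented pattern the link above describes is to set the ABFS OAuth configuration for a service principal before writing (the storage account, client id, secret and tenant below are placeholders):
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # pre-created as `spark` in a notebook
acct = "myaccount.dfs.core.windows.net"

spark.conf.set(f"fs.azure.account.auth.type.{acct}", "OAuth")
spark.conf.set(f"fs.azure.account.oauth.provider.type.{acct}",
               "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set(f"fs.azure.account.oauth2.client.id.{acct}", "<client-id>")
spark.conf.set(f"fs.azure.account.oauth2.client.secret.{acct}", "<client-secret>")
spark.conf.set(f"fs.azure.account.oauth2.client.endpoint.{acct}",
               "https://login.microsoftonline.com/<tenant-id>/oauth2/token")

df = spark.range(10)  # placeholder DataFrame
df.write.parquet(f"abfss://container@{acct}/output")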
1 vote -
Support for running notebooks from within notebooks using the %run magic
IPython supports the use of the %run magic to run notebooks from within a notebook; having this support in the Synapse Spark environment would unblock migration from other environments.
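For reference, the existing IPython behaviour being asked for (the path is illustrative):
# In IPython/Jupyter, %run executes another script or notebook in the current namespace
%run ./shared/setup_notebook.ipynb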
1 vote