Currently, Delta Lake tables cannot be queried using SQL on-demand. Adding this functionality would make SQL on-demand far more useful for analyzing Delta Lake tables.
This would save time by letting us query any Spark database Delta Lake table without spinning up Spark pools.
64 votes
Enable a way to reference and add custom .jars when initiating a Synapse Spark cluster, so that Synapse notebooks can import from those .jars.
38 votes
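In Spark notebooks, session-level settings can be passed with the %%configure magic; a sketch of what pointing a session at a custom jar might look like (the abfss path is a placeholder, and whether a Synapse pool honors spark.jars this way is an assumption, which is the gap this request is about):

```
%%configure -f
{
    "conf": {
        "spark.jars": "abfss://container@account.dfs.core.windows.net/libs/my-lib.jar"
    }
}
```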
Would love to have support for the R language in the notebook experience of Synapse Studio.
24 votes
Please add the OPENROWSET functionality to dedicated pools.
1. More (and faster) parser options than External File Format offers, such as row delimiter
2. Auto-infer schema
3. A more convenient way to define the file format directly
4. Syntactical harmony between serverless and dedicated.
23 votes
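For context, this is the serverless-side convenience being asked for in dedicated pools: OPENROWSET sets the parser options inline and can infer the schema from the files (the storage path below is a placeholder):

SELECT *
FROM OPENROWSET(
    BULK 'https://<account>.dfs.core.windows.net/<container>/data/*.csv',
    FORMAT = 'CSV',
    PARSER_VERSION = '2.0',
    HEADER_ROW = TRUE,
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n'
) AS [rows];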
Today we can query data stored in Parquet files on ADLS. It would be fantastic to extend this to support the new "Delta Lake" file format recently open-sourced by the Databricks team (see https://delta.io).
This would allow us to take advantage of the ACID guarantees that the Delta format brings to the data lake.
233 votes
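For illustration, such a query might mirror the existing Parquet OPENROWSET syntax with a Delta format option (the storage URL is a placeholder, and the DELTA format keyword is the requested feature, not something that exists today):

SELECT TOP 10 *
FROM OPENROWSET(
    BULK 'https://<account>.dfs.core.windows.net/<container>/delta-table/',
    FORMAT = 'DELTA'   -- hypothetical: the feature being requested
) AS [result];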
Is it possible to add the use of global parameters in Synapse Studio, exactly the same as in Azure Data Factory? This makes life so much easier when working with different environments.
15 votes
The "Common Data Model" (CDM) format is becoming increasingly popular. Therefore it would be important that this connector not only exists in ADF, but that it is also possible to read and write (via CETAS) CDM directly from SQL on-demand. For example:
SELECT * FROM OPENROWSET(
    BULK 'path_to_folder',
    DATA_SOURCE = 'external_data_source_name',
    FORMAT = 'CDM'
) AS [r];

CREATE EXTERNAL TABLE cdm.cdm_table
WITH (
    LOCATION = 'path_to_folder',
    DATA_SOURCE = external_data_source_name,
    FILE_FORMAT = CDM
) AS SELECT * FROM source;
19 votes
Add the ability to change the connection policy, as with Microsoft.Sql.
I want to apply Redirect to connections from outside Azure to optimize throughput.
9 votes
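On Azure SQL, the connection policy can be switched with the Azure CLI; presumably the request is for an equivalent on Synapse (resource group and server names below are placeholders):

az sql server conn-policy update \
    --resource-group myResourceGroup \
    --server myserver \
    --connection-type Redirect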
Most Azure products and services have a firewall that can be configured to restrict access. A valuable aid when configuring firewalls appropriately and troubleshooting connectivity issues is being able to see which connection attempts the firewall rules block. This feature is very common on other commercial firewall products. Please add this ability so that users can review firewall logs for products like SQL Server and others.
10 votes
Request to shorten execution time of pipelines with more than two notebooks in Synapse Studio.
From my observation, the reason each notebook takes time to run may be that each one has to establish its own Spark session, which is slow.
Looking at the parameters on the Spark pool, there are none that help achieve this.
If the same Spark pool is used across multiple notebooks in a pipeline, please consider letting them share one session to shorten total execution time.
7 votes
https://cloudblogs.microsoft.com/sqlserver/2019/11/07/new-in-azure-synapse-analytics-cicd-for-sql-analytics-using-sql-server-data-tools/ describes SQL Server Data Tools support for Synapse without mentioning that it only covers dedicated pools. We need similar support for Synapse serverless as well, otherwise CI/CD will be hard to handle.
8 votes
The ability to run EXPLAIN PLAN syntax for serverless SQL pool queries in Azure Synapse Analytics.
6 votes
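Dedicated SQL pools already support an EXPLAIN statement; presumably the request is for the same thing against serverless, e.g. (table name is a placeholder):

EXPLAIN
SELECT COUNT(*) FROM dbo.my_table;   -- in dedicated pools this returns the query plan as XML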
When executing a DROP EXTERNAL TABLE statement, only the metadata of the table gets deleted, not the external files in the ADLS directory, and we have to delete the files from the storage account manually.
It would be nice if this were taken care of automatically when dropping the external table.
7 votes
Allow privately hosted pip repositories in Synapse.
Private packages are commonly used, and currently the workaround of copying all the code is pretty obnoxious.
7 votes
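A private feed is normally pointed at via a requirements file; a sketch of what a Spark pool would need to accept (the feed URL and package name are placeholders):

--index-url https://pkgs.example.com/my-feed/pypi/simple/
my-private-package==1.0.0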
Need a %pip installer within the workspace notebook, so we can install libraries as needed immediately.
8 votes
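As it works in other notebook environments, the requested magic would look like this (the package name is a placeholder):

%pip install my-package==1.0.0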
In the scenario where we have a notebook activity in a pipeline, when we run this pipeline, we'd like the option to see the notebook that ran. Similar behaviour exists in ADF for Databricks: when running a Databricks notebook from ADF, the output contains a link to the run notebook.
5 votes
Parquet file count, file size, and row group size greatly influence query speeds in Synapse serverless. Yet Parquet file creation through CETAS cannot be configured in any way except for the type of compression. Moreover, CETAS is not consistent in Parquet file creation; the generated file size and the number of files created vary wildly. I've seen CETAS queries return a single 1.5GB file, or dozens of 1MB files. Given this behavior, it is very hard to use CETAS as part of a production data pipeline; at the moment it is more of a prototyping tool.
6 votes
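For reference, the only knob available today is the compression chosen when the external file format is created; nothing controls file count or row group size (names below are placeholders):

CREATE EXTERNAL FILE FORMAT parquet_snappy
WITH (FORMAT_TYPE = PARQUET,
      DATA_COMPRESSION = 'org.apache.hadoop.io.compress.SnappyCodec');

CREATE EXTERNAL TABLE dbo.my_cetas_output
WITH (LOCATION = '/output/',
      DATA_SOURCE = my_data_source,
      FILE_FORMAT = parquet_snappy)   -- no option for file count or row group size
AS SELECT * FROM dbo.my_source;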
We do plan to improve CETAS when it comes to partitioning the output and balancing the size of files.
We hadn't planned to allow specifying the number of files; we are interested to see why you need that.
Allow a subset of users to drop serverless external tables via standard SQL and optionally have the underlying files deleted
We need the ability for a subset of users to drop Synapse serverless external tables via standard SQL, and have the underlying files deleted, without knowing about the underlying storage blob containers: a SQL command option (DROP FILE) that allows a user to drop a table without having to manually delete files from storage.
6 votes
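A sketch of what the requested option might look like (the DROP FILE clause is purely hypothetical; it does not exist today):

-- hypothetical syntax for the requested behavior
DROP EXTERNAL TABLE dbo.my_table WITH (DROP FILE);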
GRANT IMPERSONATE ON USER::User1 TO User2; is not working for the serverless pool in Synapse, although Microsoft's documentation says it is applicable. Overall, GRANT on database principals doesn't work in the serverless pool.
4 votes