Update: Microsoft will be moving away from UserVoice sites on a product-by-product basis throughout the 2021 calendar year. We will leverage first-party solutions for customer feedback.

Azure Synapse Analytics

We would love to hear your ideas for new features for Azure Synapse Analytics. Below, enter a new idea or upvote an existing one. The Synapse engineering team pays attention to all requests.

If instead you need a technical question answered or other help, try one of these options: Documentation, the MSDN forum, or Stack Overflow. If you need support, please open a support ticket.

  1. [In Preview] Hash distribute fact table on multiple columns

    A large fact table often has join keys to multiple dimension tables, so while it can be optimised for one join it always incurs data movement for the others. Hash distribution should accept multiple columns so that all of these joins are optimised and dimension joins create no data movement.
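    For illustration, a minimal sketch (table and column names are hypothetical). Today only a single distribution column is accepted; the preview capability named in the title would accept a column list:

    -- Today: exactly one hash distribution column.
    CREATE TABLE dbo.FactSales
    WITH
    (
        DISTRIBUTION = HASH ( CustomerKey ),
        CLUSTERED COLUMNSTORE INDEX
    )
    AS SELECT * FROM stage.Sales;

    -- Requested (multi-column hash distribution):
    -- DISTRIBUTION = HASH ( CustomerKey, ProductKey, StoreKey )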

    136 votes  ·  9 comments  ·  SQL/Performance
  2. Enable Dynamic compute - scaling up the DW should not kill any existing query

    Scaling the warehouse up or down kills all ongoing sessions.
    Adding or removing compute capacity should be truly dynamic: when I add more compute by scaling up, the DW should absorb the additional capacity without killing existing queries.
    That compute could instead be brought into play for new queries submitted after the addition.

    Thanks,
    Mayank
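    For context, a dedicated pool is scaled with a statement like the one below (the pool name and target service objective are placeholders); it is this operation that currently terminates running queries:

    -- Run from the master database; MySqlDw and DW1000c are placeholders.
    ALTER DATABASE MySqlDw MODIFY (SERVICE_OBJECTIVE = 'DW1000c');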

    82 votes  ·  0 comments  ·  SQL/Performance
  3. Support table variables

    Prior to data analysis it is often necessary to collect some initial information and store it in table form. Creating and dropping a table or a temp table takes too much time (seconds), since it has to be distributed over multiple nodes. Table variables would solve this performance issue.
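    A minimal sketch of the gap, assuming a dedicated SQL pool (object names are hypothetical). The temp-table workaround used today involves distributed DDL; the requested table-variable form is shown commented out because it is not supported:

    -- Today: a distributed temp table must be created and dropped.
    CREATE TABLE #InitialInfo
    WITH ( DISTRIBUTION = ROUND_ROBIN, HEAP )
    AS
    SELECT name, object_id FROM sys.tables;

    DROP TABLE #InitialInfo;

    -- Requested (SQL Server syntax, unsupported here today):
    -- DECLARE @InitialInfo TABLE (name sysname, object_id int);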

    63 votes  ·  under review  ·  0 comments  ·  SQL/Performance
  4. Better Database & Query Monitoring through Azure Portal

    Enable a better experience for monitoring and analysing queries and DWH/query performance. Currently the SQL DW blade in the Azure Portal shows only very basic information about queries and their execution, plus DTUs, with no way to customise the charts or the (query) lists. Only the last 50 queries are shown, there are no overall reports or statistics (e.g. long-running queries), and there is no way to filter the list or show other metrics. Please enable a monitoring experience for query and DB performance similar to what SQL DB offers (Query…

    40 votes  ·  1 comment  ·  SQL/Performance

    We are actively improving our monitoring experience. We have future improvements on our roadmap, such as Query Data Store and integration with Azure Monitor (logs and metrics) for near-real-time troubleshooting in the Azure portal. If anyone has any other feedback, please elaborate on your scenario on this thread and have your team continue to vote. Thank you for your continued support!

  5. Resource class override

    Having a fixed resource class assigned to a user is too rigid. Most queries can be executed efficiently with a small resource class, but certain operations (e.g. rebuilding an index) should run with a larger RC. I need to switch between several user accounts just to have appropriate resources for whatever operation or query I need to run, and users assigned to xlargerc can only execute a single query at a time.

    It would be great to be able to set or change the resource class used for executing each statement. Alternatively, being able to execute as a different user could…

    40 votes  ·  4 comments  ·  SQL/Performance

    We announced workload isolation via workload groups for public preview at Ignite in November 2019. Workload groups allow you to create your own custom resource classes (among other things). Check out workload classification, which lets you assign requests by more than just login information too!
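    A hedged sketch of that approach (group, classifier, login, and label names are hypothetical): a workload group acts as a custom resource class, and a classifier routes requests to it by more than login alone:

    CREATE WORKLOAD GROUP wgIndexRebuilds
    WITH
    (
        MIN_PERCENTAGE_RESOURCE = 0,
        CAP_PERCENTAGE_RESOURCE = 100,
        REQUEST_MIN_RESOURCE_GRANT_PERCENT = 25  -- behaves like a large resource class
    );

    CREATE WORKLOAD CLASSIFIER wcIndexRebuilds
    WITH
    (
        WORKLOAD_GROUP = 'wgIndexRebuilds',
        MEMBERNAME = 'etl_user',        -- hypothetical login
        WLM_LABEL = 'index_rebuild'     -- matched via OPTION (LABEL = 'index_rebuild')
    );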

  6. Support rebuilding DW indexes with Online = ON so the table is accessible during the rebuild

    This works:
    ALTER INDEX ALL ON VulnerabilityScope REBUILD
    But while the rebuild takes place, the entire table is locked against any query, even selecting a single row. Users complain when the table is inaccessible during the 30-minute rebuild.

    This currently returns an error:
    ALTER INDEX ALL ON VulnerabilityScope REBUILD WITH (ONLINE = ON)
    Parse error at line: 1, column: 48: Incorrect syntax near 'ONLINE'.

    Please support rebuilding DW indexes with Online = ON

    36 votes  ·  2 comments  ·  SQL/Performance

    Thank you for the feedback, folks. For now, you can use ALTER INDEX … REORGANIZE as the online operation when maintaining your CCI. We will reach out when this is on our roadmap. Please describe your scenario for this capability below and continue to vote. Thanks for your patience.
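    For reference, the suggested online alternative (table name taken from the example above):

    -- Reorganize runs online; the table stays queryable.
    ALTER INDEX ALL ON dbo.VulnerabilityScope REORGANIZE;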

  7. Support for updating statistics on a single partition

    This would improve update-statistics performance for users who load data into a single partition and update statistics afterwards.

    Otherwise, with growing data size, updating statistics takes longer after every load.
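    To make the request concrete, a hedged sketch (table, statistics, and partition numbers are hypothetical). Only the whole-table form works today; the per-partition form exists in SQL Server with incremental statistics and is what this idea asks for:

    -- Today: whole-table statistics update after each load.
    UPDATE STATISTICS dbo.FactSales (stat_ingressDay);

    -- Requested (SQL Server syntax, not supported in this service today):
    -- UPDATE STATISTICS dbo.FactSales (stat_ingressDay)
    -- WITH RESAMPLE ON PARTITIONS (42);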

    34 votes  ·  under review  ·  3 comments  ·  SQL/Performance
  8. Expose DMVs that display the actual execution plan (sqlplan) of a query on a node after the query executes

    Currently, DBCC PDW_SHOWEXECUTIONPLAN only displays the estimated query plan of the step being executed on a compute node, while the query is running. There is no way to see the actual execution plan of the query step on the node.

    This prevents us from seeing the details of the query being executed such as the stats on how many rows were estimated vs actual rows returned, or rowgroups eliminated by the ColumnStore index, or any gross under/over estimates in the row counts owing to missing/bad statistics.
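    For reference, the existing call takes a distribution ID and a session SPID and shows only the estimated plan (parameter values below are illustrative):

    -- Estimated plan for the step running on distribution 1 under SPID 375.
    DBCC PDW_SHOWEXECUTIONPLAN (1, 375);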

    33 votes  ·  1 comment  ·  SQL/Performance

    Thank you for voting for this feature! We are aware of this scenario and are looking into ways of supporting this and improving our query troubleshooting experience. In the meantime, stay tuned for an update and please continue voting for this feature.

  9. Enable distribution elimination in the generated query plan if the query predicate contains the distribution key column as a filter

    Say we have the following table:

    CREATE TABLE [dbo].[Foo]
    (
        [columnWithGuids] [uniqueidentifier] NULL,
        [column1] [int] NULL,
        [column2] [bigint] NULL,
        [ingressDay] [date] NULL
    )
    WITH
    (
        DISTRIBUTION = HASH ( [columnWithGuids] ),
        CLUSTERED COLUMNSTORE INDEX,
        PARTITION
        (
            [ingressDay] RANGE RIGHT FOR VALUES (N'2016-01-01T00:00:00.000', N'2016-01-02T00:00:00.000')
        )
    )

    And we have a query:

    SELECT
        SUM([column1]) AS [Total]
       ,[ingressDay]
    FROM [dbo].[Foo]
    WHERE [columnWithGuids] = '40300bab-03aa-8d51-7ecb-904dfaf47ec7'
    GROUP BY
        [ingressDay];

    Currently, Azure SQL DW does not eliminate the distributions to interrogate based on a predicate. This means that in the above case, even though only one distribution can ever hold the data we're looking for, it will…

    18 votes  ·  under review  ·  1 comment  ·  SQL/Performance
  10. Functionality to upload data to ADW where upload queries do not count in 32 concurrent queries limit

    Today we use PDW/APS, and when we upload data to PDW using DWLoader it does not count towards the 32-concurrent-query limit; this ensures users have the whole query window available for the data warehouse.
    In ADW, however, when we ingest data with Azure Data Factory or manually via PolyBase, it counts towards the 32-concurrent-query limit because it is technically a query.
    It would be great to have a feature similar to PDW's.

    16 votes  ·  0 comments  ·  SQL/Performance
  11. Performance with respect to dynamic SQL

    In our solutions we load data from csv-files into SQL DW internal tables using polybase/external tables.

    Per roll-out we face the challenge that the schema (columns) in the CSV may differ for the same table. We therefore implemented logic to look up the column names and data types from a dictionary table and create the external and internal table schemas dynamically. As we have roughly 500 tables with up to 20 columns per table, automating this process is the only way to go.

    The issue we face is that, compared to an on-premises SQL Server, the dynamic…
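    A minimal sketch of the pattern described, assuming a hypothetical dictionary table dbo.ColumnDictionary and pre-created external data source and file format; the point is that every generated statement runs through dynamic SQL:

    DECLARE @ddl nvarchar(max);

    -- Build the external-table DDL for one table from the dictionary.
    SELECT @ddl = N'CREATE EXTERNAL TABLE ext.' + table_name + N' ('
                + STRING_AGG(column_name + N' ' + data_type, N', ')
                + N') WITH (LOCATION = ''/' + table_name + N'/'','
                + N' DATA_SOURCE = AzureStorage, FILE_FORMAT = CsvFormat);'
    FROM dbo.ColumnDictionary
    WHERE table_name = N'SalesStaging'
    GROUP BY table_name;

    EXEC sp_executesql @ddl;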

    16 votes  ·  2 comments  ·  SQL/Performance
  12. Provide resource usage by query (CPU, Memory, TempDB, I/O)

    Provide actual resource usage data by query, detailing CPU, memory (allocated/used), tempdb usage, and I/O.
    This would provide the metrics needed to make measurable performance improvements and to make better DWU and workload-management configuration decisions.
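    For contrast, a hedged example of the coarse request-level data that is queryable today, versus the per-query CPU/memory/tempdb/I/O breakdown being requested:

    -- Coarse per-request information available today:
    SELECT request_id, [status], resource_class, total_elapsed_time, command
    FROM sys.dm_pdw_exec_requests
    ORDER BY total_elapsed_time DESC;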

    14 votes  ·  2 comments  ·  SQL/Performance
  13. CREATE SCHEMA causes pervasive blocking

    When CREATE SCHEMA is issued, it is blocked by active work and in turn blocks (almost) all new work.

    CREATE SCHEMA should be nearly transparent to the operations of a database.

    13 votes  ·  0 comments  ·  SQL/Performance
  14. Provide insights about resource usage in Synapse Serverless

    Currently it is completely opaque to users of Synapse Serverless how many resources are used to execute a query, and when/why requests are queued because current resources are used up and other queries have to finish before a new query can start. That leads to situations where very fast commands don't execute for a long time just because another command is eating up resources. How can we understand why a command is queued?

    7 votes  ·  under review  ·  0 comments  ·  SQL/Performance
  15. Workflow capability within Azure Data warehouse to run parallel queries.

    To make the most of parallel execution and the MPP architecture, it would be great to have a workflow capability that allows users to run queries in parallel under a defined resource class.
    I appreciate this may go against the SQL engine design, but looking at the architecture we might be able to do something like this:

    workflow-Parallel (ResourceClassUser = XlargeRCuser, OptionalAllocMaxMemory = 0.5)
    {
        Exec SP_Tranform1;
        Exec SP_Tranform2;
        Exec SP_Tranform3;
        Exec SP_Tranform4;
    }

    where the 4 sprocs run in parallel under the defined RC, with a max memory allocation expressed as a percentage/ratio of the total available to that RC,…

    7 votes  ·  1 comment  ·  SQL/Performance

    Thanks for the suggestion. We announced workload isolation with workload groups for preview at Ignite in November 2019. Workload groups give you much more flexibility to define resource usage, and thus concurrency. Please have a look at workload groups and share more feedback.

  16. 7 votes  ·  0 comments  ·  SQL/Performance
  17. Allow IDENTITY columns to be used in Hash Distribution

    Since May 2020 the ability to update data in the distribution column has been generally available; shouldn't this enable using IDENTITY columns as distribution columns?

    When using IDENTITY surrogate keys it would be ideal to distribute on that column (i.e. distribute on ProductSkey across all tables).
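    A sketch of what the idea asks to allow (object names are hypothetical); today an IDENTITY column is rejected as the distribution column:

    CREATE TABLE dbo.DimProduct
    (
        ProductSkey bigint IDENTITY(1,1) NOT NULL,
        ProductName nvarchar(100) NOT NULL
    )
    WITH
    (
        DISTRIBUTION = HASH ( ProductSkey ),  -- currently not allowed on an IDENTITY column
        CLUSTERED COLUMNSTORE INDEX
    );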

    6 votes  ·  1 comment  ·  SQL/Performance
  18. Gen2 not all it was touted to be

    Our organization was assured that Gen2 would do wonders for our big data performance. It didn't. We were told that Gen2 adds data caching that will make things run faster. It doesn't. To make use of data caching, we added clustered columnstore indexes to our tables that have in excess of 100 million records, and our data analytics now run a few minutes longer, not shorter; our result sets are too large to cache and reuse. Looking at the data warehouse usage stats, compute used doesn't exceed 40 DWUs, but the only way to get faster runtimes is to…

    6 votes  ·  0 comments  ·  SQL/Performance
  19. Workload Isolation in Azure Synapse

    Workload isolation in Synapse currently reserves resources per request on the basis of REQUEST_MIN_RESOURCE_GRANT_PERCENT and REQUEST_MAX_RESOURCE_GRANT_PERCENT, which do not differentiate between a high-cost query and a low-cost (short) query. We want a mechanism that can allocate resources on the basis of query cost; this would help achieve optimal resource utilisation within a workload group.

    5 votes  ·  0 comments  ·  SQL/Performance
  20. Materialized view not used on an IN clause with more than 64 values

    Materialized views help on queries until there is a WHERE clause with an IN list of more than 64 values. Then the query is internally rewritten and the materialized view isn't used. Please enhance materialized views to support this scenario.

    To reproduce this, create a table distributed by SalesOrderNumber:

    create table dbo.FactInternetSales_DistBySalesOrderSummary
    with (distribution=hash(SalesOrderNumber))
    as
    select *
    from dbo.FactInternetSales;

    Then create a materialized view:

    create materialized view dbo.FactInternetSales_SalesOrderSummary
    with (distribution=hash(SalesOrderNumber))
    as
    select SalesOrderNumber, ProductKey, sum(SalesAmount) as SalesAmount
    from dbo.FactInternetSales_DistBySalesOrderSummary
    group by SalesOrderNumber, ProductKey;

    Check the explain plan on a simple query with a small…
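    A hedged illustration of the reported behaviour (order numbers are placeholders): with 64 or fewer IN-list values the optimizer can match the materialized view; past that threshold the query is rewritten internally and the view is bypassed:

    select SalesOrderNumber, ProductKey, sum(SalesAmount) as SalesAmount
    from dbo.FactInternetSales_DistBySalesOrderSummary
    where SalesOrderNumber in ('SO43697', 'SO43698', /* ...more than 64 values... */ 'SO43800')
    group by SalesOrderNumber, ProductKey;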

    4 votes  ·  2 comments  ·  SQL/Performance