Do you have a comment or suggestion to improve SQL Server? We’d love to hear it!

Provide a hint to force intermediate materialization of CTEs or derived tables

When working with SQL Server CTEs or derived tables, the query optimizer is free to in-line the inner queries, thereby changing the physical processing characteristics of the query. This is generally a good thing, but in some cases we (database developers working with the product) can make a better decision that the query optimizer and need to override it. In many cases this is handled (join and query hints), but one place that we currently have no control is intermediate materialization of derived tables. There are many cases in which materializing the inner set first produces a much better plan, with greatly reduced I/Os. Example, from AdventureWorks:

from humanresources.employee x
select p.managerid, count(*)
from humanresources.employee p
group by p.managerid
) y (managerid, thecount) on y.managerid = x.managerid

... This query produces 767 logical reads. Now I can force intermediate materialization using TOP + ORDER BY:

from humanresources.employee x
select top (2147483647)
p.managerid, count(*)
from humanresources.employee p
group by p.managerid
order by p.managerid
) y (managerid, thecount) on y.managerid = x.managerid

... and now I get 4 logical reads.

14 votes
Sign in
Sign in with: Microsoft
Signed in as (Sign out)

We’ll send you updates on this idea

AdminMicrosoft SQL Server (Product Manager, Microsoft Azure) shared this idea  ·   ·  Flag idea as inappropriate…  ·  Admin →

Upvotes: 174

<=-=Oct 12 2006 2:21PM=-=>

Dear Adam,

Thanks for your feedback. I think this idea has quite a bit of merit. You would like a way to tell the query processor to evaluate query subexpressions in a constrained order. There is no perfect workaround right now. You could use a multi-statement TVF, but that is hard to program and hard to read. Your workaround requires an extra sort. We’ll consider this as an improvement for a future release.


<=-=Feb 4 2010 8:40AM=-=>

SELECT x.EmployeeID,
FROM HumanResources.Employee AS e
WHERE e.ManagerId = x.ManagerID
) AS theCount
FROM HumanResources.Employee AS x

is also 4 logical reads – no materialization needed.

In general, I’m not so sure about the need for this. If I truly need to materialize an intermediate result (which is typically quite small) I’m happy using a table variable or temporary table. A materialized CTE or derived table feature would probably just be abused, and some would no doubt go on to decry the lack of statistics or indexability…

I quite like the CTE/derived table as a sort of inline view, with the optimizer free to do clever stuff with the overall plan. While this facility might result in better plans on a few occasions, we have a workaround, and the change might affect the wider quality of plans produced.

I may well be wrong though ;c)

<=-=Jan 2 2012 3:59PM=-=>

Having this kind of hint would be very helpful for tuning complex CTE queries (especially self-joins on a CTE) where SQL Server today always chooses sub-optimal plans with huge I/O cost.

@SQL_Kiwi – the temp table workaround is usually much slower (in my experience, at least 30% slower) than an inner query that’s been materialized. Table variables may be faster, but they prevent SQL Server from using a parallel query plan which negates any speed benefit.

Of course, in addition to a hint, it’d be ideal if SQL’s optimizer also got smarter about the I/O cost involved in repetitive evaluation of an inner query, to ensure that there were fewer cases where the hint was needed.

<=-=Jul 27 2012 9:22PM=-=>

In my experience, some sort of intermediate materialization (I generally force it by using temp tables), has been the trick to solving executions plans generated by sql server — usually the problem arises with multiple ctes or table functions in joins with each other. (One query took over 48 hrs, but with intermediate materialization, took 18 seconds.)
But perhaps it could be solved by better optimization (or an option to let sql server take a longer time to optimize? as in computer chess)

<=-=Nov 15 2012 5:54PM=-=>

I found this “bug” when using various grouping sets over aggregated data from a base set, all within CTEs. What should have been a few seconds became 40 minutes when I added the 3rd level CTE. In my case, 100,000 rows aggregated to just 120 rows within the 2nd-level CTE and 8 rows in the final CTE. The final CTE was used twice in the SELECT, once as the left side of an OUTER Join, then once more for a reason I don’t quite recollect. Rewrote the CTE’s into Table-variables and seconds later was done…

From inspection of the Explain Plan, the base query was being executed repeatedly, despite the fact that nothing warranted it. The optimizer should be smart enough to see when each CTE needs reevaluation or can be executed, then materialized internally and presented as a stable set of data to the next level CTE.

Definitely +1.

<=-=Dec 27 2012 1:55PM=-=>

I ran into this issue today and was able to use the “TOP (2147483647)” trick to reduce the run time of a complex query from hours down to seconds. I could not have used a temp table or table variable here – this needed to be a single query due to an application restriction.
I believe this hint would be extremely useful in a limited but significant number of situations. I agree with Paul that it would likely be abused/misused by some people (as many hints are), but I don’t think that’s a good reason to not implement it.

<=-=Apr 4 2013 4:46PM=-=>

Please consider the ability to specify the materialized table’s PK with the materialization hint else I’m still out of luck for some cases. I am constantly using temp tables to work around this issue (e.g. no good for use within a view). Thanks!

<=-=Apr 11 2013 10:13PM=-=>

I spend so much time working around the optimiser when it is doing the wrong thing that I would like a -RULE BASED- hint. When applied, the optimiser would attack the query based on the order of the tables, joins and predicates. Oracle used to have such a thing and it was almost always preferable to the Cost based optimiser because it led to predictable query plans that wouldn’t suddenly stop performing as your data changed or grew.

I know that MS has gone to the trouble of writing a cost based optimiser for people who can’t write sensible SQL but it thinks it is so smart that when it does the wrong thing, it can be very hard to persuade it to change its mind.

<=-=Apr 12 2013 1:23AM=-=>

For example; I have a table with bill details in. About 200,000,000 records. I loaded a new bill into it. The bill had 2,000,000. After I load it, I check that the correct dollar value is loaded by running something like this

select sum(Dollars) from BigTable where BillID = 123456

This actually performs quite well, it correctly uses the index on BillID to identify the appropriate records even though the stats have no information about the Bill that I just loaded.

But then a few days later, after the stat refresh job that I run has been around and done its work, the same query performs appallingly. The query plan knows that there are 2,000,000 records to return and decides to scan the primary key which is on a whole different unrelated field. This takes FOREVER. Now I have to persuade SQL Server to use the index on BillID for a very simple query that looks for a single BillID.

(One way to do this is to parameterise with a variable @BillID in the where clause then the optimiser does not know which BillID it is getting and optimises for an average bill. The average bill is much smaller than 2,000,000 records).

A simple rule based optimiser would use the index on BillID when searching for a single bill. Every time! No queries sometimes taking 10 seconds and sometimes taking 20 minutes.

<=-=Mar 25 2014 6:51AM=-=>

It’s a shame MS haven’t decided to go with this. You could go with Paul White’s argument that the places for materialisation are TVFs and temp tables but that condemns us to a world in which Views can never go beyond a certain complexity without fracturing their definition into a mess of over-verbose TVFs.

Views, with their server centrality and set based efficiencies are potentially a great place to program a lot of detailed business logic that’s often done at the app layer at the moment, but refusing to allow key tuning options like this sells them short.

<=-=Dec 11 2014 6:18AM=-=>

Same thing here. I have the following ;with mainData (column1,…)
( SELECT col1, …
FROM tableA
JOIN tableB
ON ..
JOIN tableC
blah blah )
select col1, …
from mainData CTE

LEFT JOIN vw_tableX rtg
ON rtg.col1 = CTE.col1 and rtg.col2 = CTE.col2

When I execute the first CTE as a temp table and then do a separate statement table join with the vw_tableX then it runs in 50secs. Running the statement as is takes close to 1h. I need to a way to have the sql run faster instead of splitting into 2 statements. I have tried options with TOP (2147483647) but doesn’t do the trick. Pls provide as an optimiser option. The optimiser is very smart but not always…

<=-=Apr 17 2015 4:48AM=-=>

If MS do provide a hint to force materialization of CTEs, it would provide a simple workaround for the problem with non-deterministic functions in CTEs raised by Itzik Ben-Gan in Connect Bug #350485.
Ideally, the optimizer should detect the use of potentially side-effecting functions in a CTE and materialize the intermediate results automatically.

<=-=May 12 2015 7:05PM=-=>

I work a lot with SSRS where you tend to need to load a ton of logic into a single SELECT, so I run into this issue a lot. As others have noticed the TOP workaround is not always effective.

The only reliable solution I’ve found is to convert the CTE to a Multi-Step Table Valued function with a Primary Key. I got this idea from Bogdan Sahlean’s answer:

It request a schema change, involves a lot of repetitive coding and leaves you with an inflexible architecture, so I resent every time I have to churn through this technique. But it does work with stunning performance benefits.

From my testing, the Primary Key component is critical to performance. If I cant see an obvious one I create one through concatenating columns (yet more code …).

It’s all insanely frustrating – while I hope that it does get resolved in some future “vNext”, after almost a decade I’m not holding my breath. Whatever – the slowing pace of peoples SQL upgrades means I’m resigned to slogging through this for years to come…


<=-=May 12 2015 7:06PM=-=>

Typo in para #3: “It request …” should read “It requires …”


Sign in
Sign in with: Microsoft
Signed in as (Sign out)
  • Anonymous commented  ·   ·  Flag as inappropriate

    Imo, MS should provide more direct means to exactly control execution plans because there are cases where the engine's AI approach proves ultimately proves itself to be inferior (even with updated stats, etc).

    Also, some use cases favor locked down query plans with highly predictable run times. Even though these run times would change relative to the underlying table sizes and perhaps become non-ideal, at least the run times wouldn't vary wildly due to plan changes due to row count thresholds.

    As part of a future materialization hint, being able to specify optional index key(s) would be appreciated.

  • Branko Dimitrijevic commented  ·   ·  Flag as inappropriate

    I think this should happen automatically when the same subtree figurates multiple times in the query plan. The more expensive the subtree execution is and the smaller its result is, the more likely it should be for the query optimizer to put a spool in front it and run the subsequent executions by just reading the spool.

    I'm not sure if transaction isolation has a role here, but this optimization should at least be safe under SNAPSHOT isolation.

    A hint could be there to nudge the optimizer in or out of this behavior, but should not be required when this is a clear win.

  • Justin Grant commented  ·   ·  Flag as inappropriate

    This is an important problem to solve. It's too hard to get predictable performance for expensive queries without intermediate materialization. In addition to a "hard" hint to force materialization, should there also be a "soft" hint that tells the optimizer to avoid running the subquery/CTE more than once? A soft hint could allow the optimizer the flexibility to decompose or rearrange the subquery or CTE as long as its I/O is only run once.

Feedback and Knowledge Base