RE: CTE and Spool operators – SQLServerCentral

April 10, 2013 at 2:51 pm

456789psw (4/9/2013)
I commonly find CTEs using spool operators in the execution plan. According to Microsoft, the spool operator uses a temp table.
"The Table Spool operator scans the input and places a copy of each row in a hidden spool table that is stored in the tempdb database and existing only for the lifetime of the query. If the operator is rewound (for example, by a Nested Loops operator) but no rebinding is needed, the spooled data is used instead of rescanning the input."
So this leads me to believe it's possible that CTEs are using tempdb. So are the results stored in memory or in tempdb? Or does it DEPEND?

In my own mind, I like to separate the entire concept of CTEs from the eventual execution of the query, much as programming language translation textbooks describe it. In my own invented theory of SQL Server's internal workings, the query starts as a simple stream of text we type in, which is then split into keywords, object references and operators (or "tokens" in interpreter-speak), with delimiters helping us along in the splitting.

Next, these keywords, references and operators are put together into syntactical constructs that are "generic" in nature with parameters, like <KEYWORD-SELECT> <COLUMN> [zero or more occurrences of <COMMA-SEPARATOR> <COLUMN>] <KEYWORD-FROM> <TABLE-OR-VIEW-OR-CTE-OR-WHATEVER> and so on. From this we have a symbolic representation of the operations we want the server to execute.
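
To make those first two phases concrete, here is a minimal sketch in Python of a tokenizer and a parser for just the toy SELECT shape described above. Everything in it (the names `tokenize` and `parse_select`, the two-keyword grammar, the dictionary it returns) is my own invention for illustration and says nothing about how SQL Server actually does it.

```python
import re

# Hypothetical toy grammar: SELECT <column> [, <column>]* FROM <source>
KEYWORDS = {"SELECT", "FROM"}
TOKEN_RE = re.compile(r"\s*(?:(?P<comma>,)|(?P<word>[A-Za-z_][A-Za-z0-9_]*))")


def tokenize(text):
    """Split raw query text into (kind, value) tokens."""
    text = text.rstrip()
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if not m:
            raise ValueError(f"unexpected character at position {pos}")
        if m.group("comma"):
            tokens.append(("COMMA", ","))
        else:
            word = m.group("word")
            if word.upper() in KEYWORDS:
                tokens.append(("KEYWORD", word.upper()))
            else:
                tokens.append(("NAME", word))
        pos = m.end()
    return tokens


def parse_select(tokens):
    """Build a symbolic description of the query from its tokens."""

    def expect(i, kind, value=None):
        # Check that token i has the required kind (and value, if given).
        if (i >= len(tokens) or tokens[i][0] != kind
                or (value is not None and tokens[i][1] != value)):
            raise SyntaxError(f"expected {value or kind} at token {i}")
        return tokens[i][1]

    expect(0, "KEYWORD", "SELECT")
    columns = [expect(1, "NAME")]
    i = 2
    # Zero or more occurrences of <COMMA-SEPARATOR> <COLUMN>
    while i < len(tokens) and tokens[i][0] == "COMMA":
        columns.append(expect(i + 1, "NAME"))
        i += 2
    expect(i, "KEYWORD", "FROM")
    source = expect(i + 1, "NAME")
    if i + 2 != len(tokens):
        raise SyntaxError("unexpected trailing tokens")
    return {"columns": columns, "source": source}
```

Note that at this stage `source` is just a name. Whether it turns out to be a table, a view or a CTE is not decided yet, which is the whole point.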

Now AT THIS POINT the server can make logically equivalent substitutions, and it will evaluate these substitutions by ranking them using statistics and calculations derived from data stored about the objects involved. Often at this point, CTEs could be eliminated entirely: they could be replaced by the syntactical constructs that the original query text described, or by completely different but logically equivalent constructs. Programming-wise, at some point the server has evaluated some number of these logically equivalent substitutions (plans), and it's going to pick the one estimated to execute best according to statistics and pass a compiled execution plan to the database execution engine (an invented name for some blob of software internal to SQL Server).
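
As a rough illustration of what I mean by substitution, here is a Python sketch that treats a query as a nested dictionary, replaces every reference to a non-recursive CTE with the CTE's own definition, and then picks the cheapest of some candidate plans. The tree shape, the `cte_ref` marker, the function names and the cost numbers are all made up for the example; a real optimizer is far more involved.

```python
def inline_ctes(node, ctes):
    """Return a copy of the tree with each CTE reference replaced by its definition.

    Assumes non-recursive CTEs only; a self-referencing CTE would never terminate here.
    """
    if isinstance(node, dict):
        if node.get("op") == "cte_ref":
            return inline_ctes(ctes[node["name"]], ctes)
        return {key: inline_ctes(value, ctes) for key, value in node.items()}
    if isinstance(node, list):
        return [inline_ctes(value, ctes) for value in node]
    return node


def cheapest(candidates):
    """Pick the (name, estimated_cost) pair with the lowest estimated cost."""
    return min(candidates, key=lambda candidate: candidate[1])


# Hypothetical CTE: WITH recent AS (SELECT ... FROM orders WHERE year = 2013)
ctes = {
    "recent": {"op": "filter", "pred": "year = 2013",
               "input": {"op": "scan", "table": "orders"}},
}
query = {"op": "project", "columns": ["id"],
         "input": {"op": "cte_ref", "name": "recent"}}

rewritten = inline_ctes(query, ctes)

# Made-up cost estimates for two logically equivalent plans.
best = cheapest([("scan then filter", 1000.0), ("index seek", 12.0)])
```

After the rewrite there is no trace of the name `recent` left in the tree, so nothing downstream can even tell a CTE was used.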

Now, during the execution of these intermediate coded little programs (the internal executable descriptions of those query plans, which are really probably just some sort of program in themselves), temp results may spill. But I like to think that the decision in the server software to spill results to tempdb is SO FAR REMOVED from the phase of program translation that trying to link CTEs to whether a query will spill temp results to tempdb is really sort of a moot point.
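
Here is a toy Python sketch of that separation, purely under my own assumptions: a spool that keeps rows in memory until it passes a row limit and then moves everything to a temporary file. The class name `ToySpool`, the row limit and the use of `pickle` are all invented for the example and are not a description of how SQL Server's spool or tempdb works.

```python
import pickle
import tempfile


class ToySpool:
    """Keeps rows in memory until a limit is exceeded, then spills them to a temp file."""

    def __init__(self, max_rows_in_memory):
        self.max_rows_in_memory = max_rows_in_memory
        self.rows = []
        self.spill_file = None  # created only if we actually spill

    def add(self, row):
        if self.spill_file is None and len(self.rows) < self.max_rows_in_memory:
            self.rows.append(row)
            return
        if self.spill_file is None:
            # First row over the limit: move everything held so far to disk.
            self.spill_file = tempfile.TemporaryFile()
            for kept in self.rows:
                pickle.dump(kept, self.spill_file)
            self.rows = []
        pickle.dump(row, self.spill_file)

    def replay(self):
        """Return every spooled row in insertion order, wherever it is stored."""
        if self.spill_file is None:
            return list(self.rows)
        self.spill_file.seek(0)
        out = []
        while True:
            try:
                out.append(pickle.load(self.spill_file))
            except EOFError:
                break
        return out


small = ToySpool(max_rows_in_memory=2)
for row in [(1,), (2,)]:
    small.add(row)

large = ToySpool(max_rows_in_memory=2)
for row in [(1,), (2,), (3,)]:
    large.add(row)
```

The same code spills in one case and not in the other, and the only thing that differs is how many rows showed up at run time. Nothing about the text of the query that fed it enters into the decision.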

Obviously, this is just a pipedream crackpot theory of mine. Microsoft could actually have a squirrel running on a treadmill for all I know, but regardless, I like to think along those lines when thinking about questions like this.