CTE and Spool operators

  • I commonly find CTE using spool operators in the execution plan. According to Microsoft the spool operator uses the temp table.

    "The Table Spool operator scans the input and places a copy of each row in a hidden spool table that is stored in the tempdb database and existing only for the lifetime of the query. If the operator is rewound (for example, by a Nested Loops operator) but no rebinding is needed, the spooled data is used instead of rescanning the input."

    So this leads to believe its possible that CTE's are using tempdb? so are the results stored in memory or tempdb? or does is DEPEND?

  • Read this http://www.sqlservercentral.com/blogs/scarydba/2009/09/09/spools-in-execution-plans/

    If spooling happens, it will show in the execution plan.

    https://sqlroadie.com/

  • If anyone is curious like most things it depends..

    If the CTE is using recursion then its uses a spool operator therefore tempdb is used....therefore not much performance benefit vs just creating a temp table

    If the CTE is non recursive then then the main qry is repeated and performance is that of using a cursor..the main anchor query is repeated therefor performance is no good!

    http://sqlblogcasts.com/blogs/tonyrogerson/archive/2008/05/11/common-table-expressions-cte-s-how-it-works-how-recursion-works-using-with-adjacency-list.aspx

  • Do you have the process that you are looking at as it might give people some pointers as to why this is happening.

    with my limited knowledge tempdb is used to hold worktables and so the CTE could be spilling results to the worktables when it needs to make way for new data coming in especially if the CTE has a DISTINCT, Group by or UNION in it.

    _________________________________________________________________________
    SSC Guide to Posting and Best Practices

  • A spool operator is used to save intermediate query results in a work table in tempdb. CTE will temporary store these results in these spools.

    Although it will use tempdb, it will most probably use tempdb pages in memory anyway.

  • 456789psw (4/10/2013)


    CTE's seem to be using tempdb just as using a temp table or table variable.

    CTEs are nothing more than a named subquery. They are not temp tables or table variables.

    Just like a normal subquery, they can include spools or objects that spill to TempDB, if they do, they will use Tem[DB, if they don't, then they won't use TempDB. No different from if you created a view instead of using a CTE or used a subquery instead of the CTE.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • 456789psw (4/10/2013)


    If anyone is curious like most things it depends..

    If the CTE is using recursion then its uses a spool operator therefore tempdb is used....therefore not much performance benefit vs just creating a temp table

    If the CTE is non recursive then then the main qry is repeated and performance is that of using a cursor..the main anchor query is repeated therefor performance is no good!

    http://sqlblogcasts.com/blogs/tonyrogerson/archive/2008/05/11/common-table-expressions-cte-s-how-it-works-how-recursion-works-using-with-adjacency-list.aspx%5B/quote%5D

    ??????

  • 456789psw (4/10/2013)


    If the CTE is using recursion then its uses a spool operator therefore tempdb is used....therefore not much performance benefit vs just creating a temp table

    How exactly would you replace a recursive CTE with a temp table?

    If the CTE is non recursive then then the main qry is repeated and performance is that of using a cursor..the main anchor query is repeated therefor performance is no good!

    If the CTE is non-recursive, it doesn't have an anchor query (anchor query is only for recursive CTEs), nor is there any query to repeat, I'm not sure what you're getting at here.

    A non-recursive CTE is just a named subquery or temporary view, whichever you prefer to imagine it as, and performs as such.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Hi Gail,

    I originally was trying to find out if CTE's use TEMPDB? I was always told to think of them as you said.

    "CTEs are nothing more than a named subquery."They are not temp tables or table variables."

    But doing more reading, its seems just because they are named subquery does not mean a temp table(worktable) is not created in the background. It looks to me that table spools are used for recursive CTE there for a spool operator is used.

    Thanks sorry for the confusion

    http://blogs.msdn.com/b/craigfr/archive/2007/10/25/recursive-ctes.aspx

  • 456789psw (4/10/2013)


    But doing more reading, its seems just because they are named subquery does not mean a temp table(worktable) is not created in the background.

    True, but the same goes for any query whatsoever in SQL. A query with no CTE can have a worktable, a query with a subquery can have a worktable, a query that uses a view can have a worktable. Nothing specific to CTEs there. A query with a CTE does not automatically use TempDB any more than any other query in SQL.

    It looks to me that table spools are used for recursive CTE there for a spool operator is used.

    Yup, spools are used with recursive CTEs, as well as in other places as the optimiser decides (like some update queries)

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • 456789psw (4/9/2013)


    I commonly find CTE using spool operators in the execution plan. According to Microsoft the spool operator uses the temp table.

    "The Table Spool operator scans the input and places a copy of each row in a hidden spool table that is stored in the tempdb database and existing only for the lifetime of the query. If the operator is rewound (for example, by a Nested Loops operator) but no rebinding is needed, the spooled data is used instead of rescanning the input."

    So this leads to believe its possible that CTE's are using tempdb? so are the results stored in memory or tempdb? or does is DEPEND?

    I like to in my own mind to separate the entire concept of CTE's from the eventual execution of the query, like what is described in programming language translation textbooks. In my own invented theory of SQL server internal workings, the query starts as a simple stream of text we type in, which is then split into keywords, object references and operators (with delimiters helping us along in splitting this text into keywords, references and operators (or "tokens" in interpreterspeak)).

    Next, these keywords, references and operators are put together into syntactical constructs that are "generic" in nature with parameters, like <KEYWORD-SELECT>,<COLUMN>,[(optionally zero or more occuring instances of <COMMA-SEPARATOR><COLUMN>]<KEYWORD-FROM><TABLE-OR-VIEW-OR-CTE-OR-WHATEVER> etc etc and from this we have a symbolic representation of operations we want the server to execute.

    Now AT THIS POINT the server can make logically equivalent substitutions (and will evaluate these substitutions using rankings and including statistics and calculations that can be derived by data stored concerning these objects). Often at this point, CTE's could be eliminated entirely as they could be replaced by syntactical constructs that the original text has described during the original typing of the query, or these could have been completely replaced by logically equivalent constructs. Programmingwise, at some point, the server has evaluated some number of these various logically equivalent substitutions (plans) and its going to have one that estimates to execute best according to statistics and will pass a compiled execution plan to the database execution engine (an invented name for some blob of software internal to SQL server).

    Now during the execution of this intermediate coded little programs (internal executable descriptions of those query plans (which is really probably just some sort of program in itself), temp results may spill. But I like to think that the decision in the server software to spill results to tempdb is SO FAR REMOVED from the phase of program translation that trying to link CTE's to whether a query will spill temp results to tempdb that its really sort of a moot point.

    Obviously, this is just a pipedream crackpot theory of mine, Microsoft could actually have a squirrel running on a treadmill for all I know, but regardless I like to think along those terms when thinking about questions like this.

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply