CTE - Intermediate Results - Yes or No

  • Hello,

    I am currently reading a book on SQL2K5 which also covers CTEs.

    Originally I thought that CTEs were optimizing your queries by generating intermediate results for the SQL contained in its body.

    I learned soon afterwards that this was NOT the case. Instead, CTEs are a logical construct only. They dont affect physical execution explicitly.

    Now just a few days ago I got this new book on SQL2K5 and in this book, it is mentioned that CTEs do indeed generate intermediate results (at least sometimes).

    Now I still kind of believe that CTEs are only logical (the execution plans at least indicate this) but I wanted to confirm with the pros here at SQLServerCentral :o)

    Thanks!

    Best Regards,

    Chris Büttner

  • Hi Christian,

    CTEs are 'physical' rather than 'logical' construct, as they produce data for use in subsequent query, right? So you can write a CTE query that performs heavy sorting or aggregation - this deffinetely requires additional storage, like in tempdb. CTEs do simplify many queries but you cannot treat them as purely logical entities. You may consider them as sort of temporary tables, usable within scope of the following query.

    🙂

    Piotr.

    ...and your only reply is slàinte mhath

  • Hello Piotr,

    Thanks for your response. If this is really the case that they can be thought of (loosely speaking) as temp tables, the following query plan doesn't fit into this scheme (if I did not make a mistake):

    /* Execute the following Statements prior to executing the CTE

    SELECT * INTO #sysobjects FROM sys.objects

    SELECT * INTO #sysviews FROM sys.views

    SELECT * INTO #syscolumns FROM sys.columns

    CREATE UNIQUE CLUSTERED INDEX UCI_sysobjects ON #sysobjects(object_id)

    CREATE UNIQUE CLUSTERED INDEX UCI_sysviews ON #sysviews(object_id)

    CREATE UNIQUE CLUSTERED INDEX UCI_syscolumns ON #syscolumns(object_id, column_id)

    */

    WITH MyCTE AS (SELECT so.* FROM #sysobjects so, #sysviews sv WHERE so.name LIKE '%a%'AND sv.object_id > 30 and sv.object_id = so.object_id)

    SELECT*

    FROMMyCTE

    INNER JOIN #sysviews v ON MyCTE.name = v.name

    INNER JOIN #syscolumns c ON MyCTE.object_id = c.object_id

    The query plan is as follows:

    Table v (outside CTE) is first joined to Table so (inside CTE).

    Then sv (inside CTE) is joined to this.

    So the CTE doesn't seem to be evaluated first. Instead the joins happen across CTE boundaries.

    It looks like it is solely the optimizers choice unless you specify query hints.

    I have attached the query plan for your reference.

    Thanks!

    Best Regards,

    Chris Büttner

  • If you read Itzik Ben-Gan's book on SQL Querying, he lumps CTE's and derived tables together because they are so similar in behavior (as you're seeing in the execution plans for the table you're working with). One of the main differences, and actually a big advantage, is that a CTE can be referred to repeatedly within the following SQL statement unlike a derived table that would have to be redefined each time you needed to use it. And, as everyone knows, CTE's work with recursion.

    Other than that, it's still just a TSQL query and the optimizer is going to make largely the same types of choices that it would make if you were dealing with a derived table.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Absolutely spot on, Grant!

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • So you agree that they (nonrecursive CTEs) are no different than derived tables from a query processing perspective?

    So when BOL states "Specifies a temporary named result set, known as a common table expression (CTE)"

    this named result set is just a logical result set, correct?

    ms-help://MS.SQLCC.v9/MS.SQLSVR.v9.en/tsqlref9/html/27cfb819-3e8d-4274-8bbe-cbbe4d9c2e23.htm

    And thanks for your input!

    P.S. I already read Itziks books, thanks to books24x7 (I hope it's allowed to mention this. I am not affiliated with this company in any way)

    Best Regards,

    Chris Büttner

  • Well, it depends. Most of the time, yes, but watch for Table Spool operators in the query plan as a way to spot interim tables being built. You'll definitely see this in a recursive CTE, but I wouldn't be surprised to see it elsewhere, depending on what the CTE is doing and how it's used. As clear as mud right?

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Actually I think it is clear for me now.

    As mentioned before it is the task of the query optimizer to play around with the order of execution based on stats etc. A CTE just doesn't give you a "specific" execution plan (again leave out recursive CTEs)

    Thanks for the valuable input from all of you!

    And if I still am wrong, please hit me 😉

    Best Regards,

    Chris Büttner

  • Btw: OPTION (FORCE ORDER) would give you a plan that evaluates the CTE first, but that also affects the plan for the query inside the CTE I assume. So this should also be no good idea.

    Best Regards,

    Chris Büttner

  • ... One of the main differences, and actually a big advantage, is that a CTE can be referred to repeatedly within the following SQL statement unlike a derived table that would have to be redefined each time you needed to use it. ....

    Please alow me to add "And the SQLserver engine is aware of that".

    Meaning a suppose your derived table is specified more than once in your query, sqlserver handles them individualy. With CTE, sqlserver knows it is handeling the same basic subset of data and that's why it will be able to handle it more efficiently.

    Basic role No 1 for any RDBMS: tell your engine what you know.

    So start using CTE's even if it's just a single replacement of a derived table.

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply