Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 

How many times the T-SQL inside the CTE is executed?

Introduction

Common Table Expression (CTE) has become very popular these days. However, many users of CTE still have a myth that the t-sql written inside the CTE will be executed only once irrespective of the number of times it is referred in the subsequent CTEs or the related query.
However, this is not correct. It is a pure misconception. And let’s try to prove this with a simple example.

Proof
The Query
;WITH CTE1 AS (
SELECT
* 
FROM
dbo.Employees
),CTE2 AS (
SELECT
*
FROM
CTE1
)
SELECT
C1.EmployeeID,
C1.Name,
C2.EmployeeID,
C2.Name,
C3.EmployeeID,
C3.Name
FROM
CTE1 C1
INNER JOIN CTE2 C2
ON C2.EmployeeID = C1.EmployeeID
INNER JOIN CTE1 C3
ON C3.EmployeeID = C1.EmployeeID

Execution Plan

CTE

Please have a look at the execution plan. It clearly shows that even though the t-sql to fetch the data from the table is written only once inside CTE1, the Scan is done thrice. This is because CTE1 is referred thrice in the final query -
  1. Two times CTE1 is directly used.
  2. Once CTE1 is used indirectly via CTE2.
Conclusion
The CTE should be thought of a view that is defined for the current query only. At the time of execution, the query optimizer will replace all the direct/indirect use of CTE with the actual query in the same way as it does for Views.

Comments

Posted by sherifffruitfly on 19 December 2011

Can you make recursive views?

Posted by Nakul Vachhrajani on 19 December 2011

The way I see CTEs is that they are a good way to write code where you absolutely need a sub-query/view. They make such code much more readable, and of course, come with additional benefits like having the ability to recurse.

Other than that, I avoid CTEs as far as possible.

Thank-you for your research, and for taking the time out to document this.

Posted by mozman on 19 December 2011

Why do you avoid using CTE's?  I use them often.  They seem to be much more efficient than using temp tables or table variables.

Posted by bphipps-931675 on 19 December 2011

My personal experience is just the opposite.  When I tried to use CTE's they were much slower than temp tables.  

Posted by al_kessler on 19 December 2011

That is anoying.  I always thought the results were kept in memory and resued.  I converted this to Adventureworks (HumanResources.Employee) and LoginId instead of Name and ran several times with statistics as both CTEs and direct reference to the table.  The first time the CTE ran it was 62ms but after 4 trials was down to the consistent 31ms of the direct reference.  I still will use the CTE as structure a large query and limit records processed but will pay closer attention to processing plan.  

Posted by colin.counsell on 19 December 2011

I haven't tested this but let me run through this theory with you. When SQL builds it's execution plan it tries to work out the most efficient way of executing the query. In this example, I'm guessing that SQL thinks it will be quicker to ignore the CTE because it can get the data quicker using the 'Index scan' from the table. If you remove all indexes from the table and re-run it I wouldn't be suprised it it only hits the database once. Alternatively, if you include a filter in CTE1 to limit the data to a small subset (most likely how a CTE would be used) then again it would only hit the db once.

Posted by tmagney on 19 December 2011

Seems to me your example only shows that the data returned by the CTE queries are scanned three times due to your query plan. However, can you prove that the queries against dbo.Employees are actually executed three separate times? My understanding is that when using a CTE the results are held in memory. So the CTE data in memory may be scanned three times, but I suspect that the data is only read once from dbo.Employees.

Posted by SAinCA on 19 December 2011

I had a situation where I pulled data by month from several 100K rows.  The aggregates needed day-on-day, week-on-week, month-on-month, quarter-on-quarter and year-on-year comparisons against prior year and in-year.  This resulted, in SS2008EE, in a 3-level set of 2 CTEs and final SELECT, the results boiling down to just 8 rows.  Developing the queries in stages, the first CTE grabbed the base data and projected aggregations and identifiers, running in about 2 seconds.  Level 2 took level 1 and did further massaging - 8 seconds.  The final SELECT projected just 8 rows - over 2 minutes!

Frustrated and stunned by the plan and the evident failure of the optimizer to deduce it only needed the base data once, I converted the entire query to a SELECT ... INTO @TV and handled the rest myself. FOUR SECONDS.

Yes, there's a MAJOR flaw in the way nested CTEs are viewed and optimized.  Be very, very careful...  And bug Microsoft to apply greater intelligence in the optimizer via Connect!

Posted by vinaypugalia on 19 December 2011

@sherifffruitfly - no we cannot have recursive views.

@tmagney - Your logic is good but in reality it does not work out that way. Reasons being -

1. You can see the INDEX SCAN taking place for all three.

2. Run this code in your SSMS and you can see that each one of the scan consumes same amount of IO.

Posted by sturdyp on 20 December 2011

@colin.counsell - Yes, SQL server will try and determine the best method of executing the query.  No, it doesn't infer anything from your use of CTEs.  The CTE (other than allowing recursion) is only for our benifit, not SQLs.

@tmagney - Doesn't matter how the data is read, the first time a page is accessed, it is loaded into memory.  It stays there until SQL determines it needs the memory for something else.  So yes, if you monitored the physical disc IO the data would only be read once. (Assuming the server has enough memory), but that would have nothing to do with the CTE.  It would be the same if you repeated the same sub-query three times.

Leave a Comment

Please register or log in to leave a comment.