Debugging stacked CTEs efficently

Question

Debugging stacked CTEs efficently

Banana-823045

SSCommitted

Points: 1689
More actions
March 20, 2013 at 1:16 pm

#275480

I'm curious if anyone else has figured out a good system for performing analysis on a query that may have multiple CTEs stacked. While some problems can be resolved by considering the T-SQL and making logical evaluations, sometime it's nice to be able to see the intermediate data so you can see what is really going on under the hood.
When I have such query, I have the following steps:
1) add a /* and */ delimiter on the bottom part of the query to comment out all subsequent steps
2) comment out the line after the closing parenthesis of last CTE to make the next CTE a outer statement
3) run the query, analyze the data
4) uncomment the comment from #2, move the /* to the next CTE and comment out the next line after the new closing parenthesis. Repeat until we reach the final statement.
This works but I wondered if there was a more effective and a bit less time-consuming & error-prone approach that would permit for data analysis of individual steps.
Thanks in advance!

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply

Vedran Kesegic SSCrazy Eights Points: 9302 More actions · Answer 1

Intermediate results sometimes can be calculated separately and stored to temp tables. Post your cte if you want more specific answer. There is no debugging tool that i am aware of, that will show you how sql internally executes the command (cte is executed as a SINGLE command) and show you the intermediate data. Execution plan is closest to it.

_____________________________________________________
Microsoft Certified Master: SQL Server 2008
XDetails Addin - for SQL Developers
blog.sqlxdetails.com - Transaction log myths

Orlando Colamatteo SSC Guru Points: 182276 More actions · Answer 2

Banana-823045 (3/20/2013)
I'm curious if anyone else has figured out a good system for performing analysis on a query that may have multiple CTEs stacked. While some problems can be resolved by considering the T-SQL and making logical evaluations, sometime it's nice to be able to see the intermediate data so you can see what is really going on under the hood.

That would require that CTEs be a fundamentally different thing than what they are. The optimizer is handed a CTE by you, the query writer, but that is just a description of the data you want in the form of a question, it is not an actual set of native instructions the database engine can use to retrieve the data. The optimizer will take your query and change it into a native set of instructions. We can get a view into these instructions by looking at the execution plan. In SSMS highlight your query and press Ctrl+L to see one.

Here is an example. Consider this simple query to show all columns in a database:

SELECT s.name AS [schema_name],

t.name AS table_name,

c.name AS column_name

FROM sys.schemas s

INNER JOIN sys.tables t ON s.[schema_id] = t.[schema_id]

INNER JOIN sys.columns c ON t.[object_id] = c.[object_id]

ORDER BY s.name,

t.name,

c.column_id;

Now consider a logically equivalent query that uses a series of three cascaded CTEs:

WITH schema_cte

AS (

SELECT [schema_id],

name AS [schema_name]

FROM sys.schemas

),

table_cte

AS (

SELECT s.[schema_id],

s.schema_name,

t.[object_id],

t.name AS table_name

FROM schema_cte s

INNER JOIN sys.tables t ON s.[schema_id] = t.[schema_id]

),

column_cte

AS (

SELECT t.schema_name,

t.table_name,

c.name AS column_name,

c.column_id

FROM table_cte t

INNER JOIN sys.columns c ON t.[object_id] = c.[object_id]

)

SELECT [schema_name],

table_name,

column_name

FROM column_cte

ORDER BY [schema_name],

table_name,

column_id;

The queries look very different, however they will deliver the same results. Further, when we look at the execution plans of each we see that the optimizer actually decided to execute them in the exact same way.

So, as you can see, it's not as if the database engine processes your query in steps where it materializes the first CTE, then takes that intermediate result and uses it to materialize the second CTE, and so on. It reworks your query, sometimes many thousands of different ways, looking for an efficient way to execute your query.

Regarding debugging, what you have showed in terms of steps is how you would debug a query with cascading CTEs. If you want a debugging experience where you can see the intermediate results consider using #temp tables.

There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato