Common Table Expressions

Question

Common Table Expressions

Viewing 12 posts - 31 through 42 (of 42 total)

You must be logged in to reply to this topic. Login to reply

Steve Jones - SSC Editor SSC Guru Points: 740241 More actions · Answer 1

I think the biggest benefit for many people is that CTEs allow them to write complex SQL in an easier way. Moving to derived tables just seems harder for many people writing queries.

A CTE can be written first, meaning write part of your SQL that you need, then once you enclose it in the CTE formatting, you add it like any other table or view, without having to create those objects.

However as someone that's been writing T-SQL for over a dozen years, I'm not sure it's that much easier for me. It is slightly clearer in some cases, but for the most part I find relatively few queries where I'd use it. I think that's an experience thing

TimothyAWiseman SSCrazy Eights Points: 8819 More actions · Answer 2

This was a simple, clear, and well written article. The subject matter is a bit basic for someone who is familiar with the new features in 2005, but it is an excellent introduction to CTEs.

---
Timothy A Wiseman
SQL Blog: http://timothyawiseman.wordpress.com/

srienstr SSCrazy Points: 2435 More actions · Answer 3

69.5% of the total batch cost for the derived table method according to the execution plan, but only 31.8% of the runtime (according to the getdate() queries I inserted at various steps)... I'm confused. I'd assumed that cost and runtime were directly correlated unless there were competing processes, but it seems to occur consistently, regardless of which query is first in the batch.

Edit: There is a competing load going on, which is probably what threw off how the cost compares to the execution time due to the competition for CPU, RAM and I/O.

Puto me cogitare, ergo puto me esse.
I think that I think, therefore I think that I am.

jdoconsulting Grasshopper Points: 11 More actions · Answer 4

Does anyone have a good example of using a CTE to rollup assembly costs in a nested Bill of Materials? It seems to me that this should be done from the bottom of the assembly upwards. Most examples I've found deal with rolling up salaries for employees in an organization. The difference with a BOM is that an assembly can be part of multiple assemblies so there isn't just one path in the hierarchy like there is with an employee in an organization.

GSquared SSC Guru Points: 260824 More actions · Answer 5

jdoconsulting (3/18/2008)
Does anyone have a good example of using a CTE to rollup assembly costs in a nested Bill of Materials? It seems to me that this should be done from the bottom of the assembly upwards. Most examples I've found deal with rolling up salaries for employees in an organization. The difference with a BOM is that an assembly can be part of multiple assemblies so there isn't just one path in the hierarchy like there is with an employee in an organization.

Not sure how you would do that. If you have 2-inch #10 bolts as one of you assembly parts, how do you roll that up? It might be used in hundreds of different end products.

On BoM hierarchies, I usually roll them down. Start at the final product, or the sub-assembly that you want data for, and go down from there.

If you really want to go from the bottom up, you'd just reverse the join from the Books Online example. Instead of joining the Parent ID in the table to the ID in the CTE, join the ID in the table to the Parent ID in the CTE. That gives you a bottom-up hierarchy.

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

GSquared SSC Guru Points: 260824 More actions · Answer 6

srienstr (3/18/2008)
69.5% of the total batch cost for the derived table method according to the execution plan, but only 31.8% of the runtime (according to the getdate() queries I inserted at various steps)... I'm confused. I'd assumed that cost and runtime were directly correlated unless there were competing processes, but it seems to occur consistently, regardless of which query is first in the batch.
Edit: There is a competing load going on, which is probably what threw off how the cost compares to the execution time due to the competition for CPU, RAM and I/O.

Estimated Cost does not neccessarily equal Percentage of Runtime

It sometimes does, and sometimes is close, but not always. Cost isn't just an estimate of how long something will take. It's also an estimate of how much CPU time it will take, how many I/O cycles, how many mathematical computations, how many string functions, etc. It's meant to be a numeric representation of how much effort the server is going through to accomplish something. If it were meant to correlate directly to time, it would be an estimated execution time field, not an estimated cost.

As an aside, you're better of using "set statistics time on", than adding "select getdate()" clauses to your query. It's more accurate since the select getdate() query can, itself, add time to the runtime of the query.

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

Jeff Moden SSC Guru Points: 1004683 More actions · Answer 7

And then you better make sure you run the code more than once, even on a quiet system...

DECLARE @Year INT

SET @Year = 2008

SET STATISTICS IO ON

SET STATISTICS TIME ON

;WITH cteDates AS

(

SELECT DATEADD(yy,@Year-1900,0)+Number AS TheDate

FROM Master.dbo.spt_Values

WHERE Type = 'P'

AND Number <= 366

)

SELECT cd.TheDate, DATEPART(qq,TheDate) AS TheQuarter

FROM cteDates cd

WHERE YEAR(TheDate) = @Year

SET STATISTICS TIME OFF

PRINT REPLICATE('=',100)

SET STATISTICS TIME ON

SELECT DATEADD(yy,@Year-1900,0)+Number AS TheDate,

DATEPART(qq,DATEADD(yy,@Year-1900,0)+Number) AS TheQuarter

FROM Master.dbo.spt_Values

WHERE Type = 'P'

AND Number < 366

AND YEAR(DATEADD(yy,@Year-1900,0)+Number) = @Year

SET STATISTICS TIME OFF

SET STATISTICS IO OFF

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

srienstr SSCrazy Points: 2435 More actions · Answer 8

Jeff Moden (3/18/2008)
And then you better make sure you run the code more than once, even on a quiet system...

I did three runs in one order, and two runs with the order switched, the % cost matched, though the % time did vary slightly. The issue that threw me off was that an I/O intensive load was running at the same time, so the I/O portion of the costs was actually taking much more time relative to the CPU and RAM portions.

Then again, it may just be that the cost formula is not properly balanced for our server for some reason that isn't apparent to me.

Puto me cogitare, ergo puto me esse.
I think that I think, therefore I think that I am.

Jeff Moden SSC Guru Points: 1004683 More actions · Answer 9

Did you look at the "Message" window? Sometimes one or the other runs in half the time as the other. Some folks look at something like that on just one run and try to take it to the bank thinking (like writing company SQL standards) that one or the other is better... bad mistake a lot of times.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

GSquared SSC Guru Points: 260824 More actions · Answer 10

A script I posted on another thread in this forum illustrates why I like the layout and readability of CTEs.

The thread is at http://www.sqlservercentral.com/Forums/Topic470379-338-1.aspx

The script is:

;with

CTE1 (ID, Position, Val) as

(select id, numbers.number, substring(reverse(rtrim(clean)), numbers.number, 1)

from dbo.numbercleanb

inner join dbo.numbers

on numbers.number between 1 and len(clean)),

CTE2 (ID, Pos, Val) as

(select id, row_number() over (partition by id order by position), val

from cte1

where val like '[0-9]'),

CTE3 (ID, Number) as

(select id, sum(power(cast(10 as bigint), cast(pos-1 as bigint)) * cast(val as bigint))

from cte2

group by id)

select count(*)

from cte3

inner join dbo.numbercleanb

on cte3.id = numbercleanb.id

and cte3.number != numbercleanb.number

A "derived tables" version might look like:

select count(*)

from

(select id,

sum(power(cast(10 as bigint), cast(pos-1 as bigint)) * cast(val as bigint)) as number

from

(select id, row_number() over (partition by id order by position) as pos, val

from

(select id, numbers.number as position,

substring(reverse(rtrim(clean)), numbers.number, 1) as val

from dbo.numbercleanb

inner join dbo.numbers

on numbers.number between 1 and len(clean)) Sub1

where val like '[0-9]') Sub2

group by id) Sub3

inner join dbo.numbercleanb

on sub3.id = numbercleanb.id

and sub3.number != numbercleanb.number

The reasons I find the CTEs more readable are:

A) I don't have to keep increasing the indention as I increase the number of levels of subquery. This means I'm less likely to have to scroll left and right on the screen.

B) The column names in the derived tables are less visible than those in the CTEs.

C) If I need to debug or modify part of the query, the CTEs are self-contained and can be modified more modularly.

D) Each query, including the final/outer query, is all in one place. In the derived tables version, part of each subquery is above and part below, the inner subs. This is forced because I have to Select, From, Where for each subquery, and From comes before Where. In the CTEs, this isn't neccessary and it keeps each in one place.

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 11

GSquared,

Good example of how CTE's can help make some code easier to read and/or maintain.

😎

Julie Breutzmann Hall of Fame Points: 3637 More actions · Answer 12

Thanks, Kirk for a well-written explanation of CTEs.

Great discussion, too. Thanks!