In SQL Server 2005, Common Table Expressions (CTEs) can be of enormous value in generating scripts that create frequent reports. The creation of routine reports for management and analysis can involve the creation of complicated SELECT statements. But, as reports may often need to be changed as the business situation changes, it is important that the code used to create the report be easily maintained and easily readable. CTEs can make many complicated SELECT statements far more readable and will also simplify certain task such as concatenation and organizing by percentile.
The basic syntax for CTEs was described beautifully by Srinivas Sampath in his article Common Table Expressions in SQL Server 2005 ( http://www.sqlservercentral.com/articles/Development/commontableexpressionsinsqlserver2005/1758/ ) One of the important things he points out is that it is possible to define multiple CTEs for use in a single statement, so long as they are separated by commas. When defining multiple CTEs, a later one can reference an earlier one, allowing each successive one to build upon or make use of the last one if desired, but they cannot forward reference.
One example of using multiple CTEs to make a query more comprehensible and therefore easier to maintain and modify can be taken from the Pubs database. This code lists each author or set of authors by name, along with the number of titles for that set of authors and the number sold:
with AuthorNameCTE as ( --Provides the full number and author number for each author of a title
A.au_fname + ' ' + A.au_lname as AuthorName,
join authors A
on TA.au_id = A.au_id
CompleteAuthorCTE as ( --Concatenates the string where there is more than one author for a book
stuff((select '; '+AN2.AuthorName
from AuthorNameCTE AN2
where AN.title_id = AN2.title_id
Order by au_ord
for xml path('')),1,1,'') as Authors
select --Provides a report of each distinct set of authors with their number of titles and total sales.
count(1) as NumTitles,
sum(YTD_sales) as TotSales
join Titles T
on T.title_id = C.title_id
This code generates the concatenation function separately form all other pieces. This query, like virtually all other possible examples, could be rewritten as a single select statement such as :
count(AL.title_id) as NumTitles,
sum(T.ytd_sales) as TotSales
'; '+A2.au_fname + ' ' + A2.au_lname
join authors A2
on TA2.au_id = A2.au_id
where TA3.title_id = TA2.title_id
order by au_ord
for xml path('')),1,1,'') as authors
join authors A3
on TA3.au_id = A3.au_id
group by TA3.title_id) AL
join titles T
on T.title_id = AL.title_id
However, though it is shorter in an absolute sense, it is less descriptive. It is, generally, more difficult to create, read, and maintain than the version using the CTEs.
In certain cases, particularly when using recursive queries, the CTE solution will be slower than one which eschews CTEs. Pete He elaborates on this fact in his article CTE Performance ( http://www.sqlservercentral.com/articles/T-SQL/2926/ ). This can certainly be a major issue, but in many cases there will be no performance degradation. In the simple sample being considered, the two queries generate exactly the same execution plan.
Even when using recursive queries, it will often be the case that the ease in development and ease in code maintenance that using a CTE can bring will be worth the performance price. In fact, one of the prime benefits of CTEs is that they make it far easier to both read and write recursive queries. The value of making the code more readable will be especially obvious when writing code to generate reports. It is common for a report to be generated relatively infrequently, but for modifications to be made to the report fairly often as the business needs change. It is also common for a business to maintain several very similar but not identical reports. In that case, maximum code reuse can be achieved by making the first one generated as readable as possible.
Another example of a report in which the CTE makes it much more readable is in generating the percentile for a given property. This requires identifying the percentile for each unique possible value of the property and then joining that with the base table. For instance, to place the year to date sales for the books in the titles table you could use:
with PercentileCTE as
(100 / C.Num) * (row_number () over (order by ytd_sales desc) -.5) as percentile
(select count(distinct ytd_sales) as Num from dbo.titles) C
join PercentileCTE P
on P.ytd_sales = T2.Ytd_sales
This can be turned into a direct query, most simply by simply moving the first select statement to a correlated subquery within the main query instead of as a CTE, however it is conceptual easier to follow and easier to read when that is handled as a separate item and then joined in naturally.
Overall, CTEs are effective at making queries, especially complex queries used for reporting, easier to both read and write. In turn, making the queries within the procedures easier to read will make them easier to maintain. Larry Wall stated in the Perl documentation, "Perl is designed to give you several ways to do anything, so consider picking the most readable one." This applies equally well to virtually every other aspect within computers, and CTEs are one way of achieving this goal.
CTE Performance by Peter He ( http://www.sqlservercentral.com/articles/T-SQL/2926/ )
Common Table Expressions in SQL Server 2005 by Srinivas Sampath ( http://www.sqlservercentral.com/articles/Development/commontableexpressionsinsqlserver2005/1758/ )
Recursive Queries in SQL Server 2005 by Srinivas Sampath ( http://www.sqlservercentral.com/articles/Development/recursivequeriesinsqlserver2005/1760/ )
Percentile on Wikipedia ( http://en.wikipedia.org/wiki/Percentile )
Performance Tuning: Concatenation Functions and Some Tuning Myths by Jeff Moden ( http://www.sqlservercentral.com/articles/Test+Data/61572/ )