I wrote about a basic PIVOT query recently. It’s an interesting way to write a query and turn row data into columns. That’s handy, and lots of people have a need for it. However I’d never used PIVOT in production code. I’ve always written a crosstab query instead.
If I look back at the query I wrote, it looks like this:
select * from ( select team , opponent , teamscore from scores results ) as rawdata pivot ( avg(teamscore) for [Opponent] in ( [KC], [OAK], [SD] ) ) as pivotresults;
This gives me these results:
team KC OAK SD
DEN 31 35 24
However I could also write this query:
select 'team' = team , 'KC' = sum( case when Opponent = 'KC' then TeamScore else 0 end) / 2 , 'OAK' = sum( case when Opponent = 'OAK' then TeamScore else 0 end) / 2 , 'SD' = sum( case when Opponent = 'SD' then TeamScore else 0 end) / 2 from scores s where team = 'DEN' group by team
That gives me the same results. The AVG item gets tricky as can don’t want zeros to be included in results. If there were a dynamic number of scores, I’d have to write a few subqueries to solve this issue.
Which one is more clear and easier to understand? That’s a good debate. However I will say that if I need to add a column, I think it’s easier to do so in the crosstab.
select 'team' = team , 'KC' = sum( case when Opponent = 'KC' then TeamScore else 0 end) / 2 , 'OAK' = sum( case when Opponent = 'OAK' then TeamScore else 0 end) / 2 , 'SD' = sum( case when Opponent = 'SD' then TeamScore else 0 end) / 2 , 'NE' = sum( case when Opponent = 'NE' then TeamScore else 0 end) from scores s where team = 'DEN' group by team ;
Here’s the PIVOT:
select * from ( select team , opponent , teamscore from scores results ) as rawdata pivot ( avg(teamscore) for [Opponent] in ( [KC], [OAK], [SD], [NE] ) ) as pivotresults;
I’ll also point out that Jeff Moden has written a few articles on this subject (Part 1 and Part 2) and his performance analysis shows that a crosstab performs better in almost all cases.
I wouldn’t recommend one over the other. You need to test them both at larger than expected data sets to determine which one works in your situation.
Filed under: Blog Tagged: syndicated, T-SQL