SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 

Advice from Aunt Kathi

Add to Technorati Favorites Add to Google
November 2008 - Posts

Grouping Sets

By Kathi Kellenberger in Advice from Aunt Kathi 11-28-2008 4:00 PM | Categories: Filed under: , , ,
Rating: |  Discuss | 4,086 Reads | 240 Reads in Last 30 Days |3 comment(s)

I have given a talk about new data types and T-SQL features to three local groups.  I am presenting it again at the St. Louis Day of .Net on December 13th (http://stlouisdayofdotnet.com/).  Since Itzik Ben-Gan was giving a similar talk at PASS last week, I decided to attend his talk to get some additional insight on the topic.

Itzik’s favorite new T-SQL feature is the grouping sets. Before his talk, I didn’t really see any benefit over using UNION queries (examples below) which will give the identical results.  Itzik said that the new grouping sets performed better than UNION queries. When I had experimented with the new feature, I had found identical performance.  Rather than embarrass myself by questioning the T-SQL guru, I made a note to revisit this later.

Today I decided to play with the new feature to see for myself.  What I found was that if no WHERE clause was used, the performance was the same between the two techniques. The execution plans were almost identical with both queries performing two clustered index scans.  Adding a WHERE clause with a filter on the clustered indexed column changed things considerably, and now the grouping set query performed better. It had one clustered index seek while the UNION query had two.  I found that adding any WHERE clause, even one that resulted in a scan, improved the performance of the grouping set query.  

My examples had only had two grouping levels, and I suspected that adding more grouping levels would show an even more dramatic difference in performance. My hunch was correct. The performance was always better with the grouping set query as long as a WHERE clause was included.  The UNION queries had to access the table once for each level while the grouping set query would access the table just once.

Here are the queries I experimented with using the SQL Server 2008 AdventureWorks sample database.

TEST ONE, identical performance, each has a clustered index scan:

select salesorderid,SUM(unitPrice),ProductID
from Sales.SalesOrderDetail
group by grouping sets(SalesOrderID,ProductID)
go
select null as salesorderid,SUM(unitprice),productid
from Sales.SalesOrderDetail
group by ProductID
union
select salesorderid,SUM(unitprice),null
from Sales.SalesOrderDetail
group by SalesOrderID
-----------------------------------------------------------------------

TEST Two: Add a WHERE clause. Now the grouping set query has one clustered index seek while the UNION query has two:

select salesorderid,SUM(unitPrice),ProductID
from Sales.SalesOrderDetail
where SalesOrderID between 40000 and 50000
group by grouping sets(SalesOrderID,ProductID)
go
select null as salesorderid,SUM(unitprice),productid
from Sales.SalesOrderDetail
where SalesOrderID between 40000 and 50000
group by ProductID
union
select salesorderid,SUM(unitprice),null
from Sales.SalesOrderDetail
where SalesOrderID between 40000 and 50000
group by SalesOrderID 
-------------------------------------------------------------------------------

TEST Three: Better performance even when filtering on a non-indexed column. Still one clustered index scan instead of two

select salesorderid,SUM(unitPrice),ProductID
from Sales.SalesOrderDetail
where UnitPrice between 10 and 20
group by grouping sets(SalesOrderID,ProductID)
go
select null as salesorderid,SUM(unitprice),productid
from Sales.SalesOrderDetail
where UnitPrice between 10 and 20
group by ProductID
union
select salesorderid,SUM(unitprice),null
from Sales.SalesOrderDetail
where UnitPrice between 10 and 20
group by SalesOrderID
-----------------------------------------------------------------------------

TEST Four: The performance difference increases with each grouping level added.

select salesorderid,SUM(unitPrice),ProductID,SpecialOfferID
from Sales.SalesOrderDetail
where SalesOrderID between 40000 and 50000
group by grouping sets(SalesOrderID,ProductID,SpecialOfferID)
go
select null as salesorderid,SUM(unitprice),productid, null as SpecialOfferID
from Sales.SalesOrderDetail
where SalesOrderID between 40000 and 50000
group by ProductID
union
select salesorderid,SUM(unitprice),null,null
from Sales.SalesOrderDetail
where SalesOrderID between 40000 and 50000
group by SalesOrderID
Union
select null,SUM(unitprice),null,SpecialOfferID
from Sales.SalesOrderDetail
where SalesOrderID between 40000 and 50000
group by SpecialOfferID
 
 

PASS

By Kathi Kellenberger in Advice from Aunt Kathi 11-22-2008 5:09 PM | Categories: Filed under:
Rating: (not yet rated) Rate this |  Discuss | 2,852 Reads | 66 Reads in Last 30 Days |1 comment(s)

I arrived home this morning after another PASS Summit. This is my 6th in a row. Each year I have more fun, get more excited about SQL Server and meet more people.

This year had a different dimension. I had additional options because of being an MVP. There were MVP sessions as well as a reception one night and dinner last night. I also wanted to focus on networking and met with a publisher about a potential book deal. If you were there and spoke with me, you probably know I was sick all week and lost my voice. I didn't get to party with friends as much as normal and sadly didn't get to go out for Karaoke.

As a key volunteer, I know about the challenges PASS has faced the past couple of years. We are working very hard to become more transparent and streamline our processes. We realize that we can never please everybody and must focus on our core values.  Congratulations to Andy Warren. He is one of our new board members. Andy has been a very outspoken critic of PASS, and I expect to see him make a big difference.

Being a key volunteer, as I mentioned, many attendees seek me out to praise the Summit or to pass on their complaints or ideas. One recurring theme I heard is that PASS is different than any other conference. At PASS, there is a sense of community, of family. Not only is it a great technical conference, it is a way to network and meet other people facing the same challenges that you do every day. It is a great way to interact with Microsoft, MVPs, and authors. You may end up at the same table at breakfast with the person who wrote the book that is sitting on your desk at work.

This year I was very honored to be acknowledged by PASS for the work I have done for them. I received the PASSion award given each year to the volunteer who has make the biggest difference to the organization. I've been a volunteer for four years and have enjoyed being part of such a wonderful organization. I encourage you to get involved with PASS.

Some news: Membership is now free! Our print magazine has been discontinued. We will still be publishing great content online and are looking for writers and other content. We have also started a social networking site called PASSPort. Go to SQLPass.org to check out PASS! 


PASS, anyone?

By Kathi Kellenberger in Advice from Aunt Kathi 11-15-2008 12:03 AM | Categories:
Rating: (not yet rated) Rate this |  Discuss | 2,529 Reads | 27 Reads in Last 30 Days |no comments

In less than two days I'll be in Seattle for my 6th PASS Summit. I am trying to put all the important events in Outlook, but I am finding that there is a lot of overlap. Now that I am an MVP, there is even more to consider. I will be speaking on how to get started writing technical articles on Friday, so be sure to check out my session if you are interested in that topic. Somehow, I have managed to get overly involved with PASS lately and find that I have several other responsiblities this week as well. 

This year we are offering both online and paper speaker evals -- so you don't have an excuse to not fill out your evals!  We are also looking for  volunteers to spend an hour or so keying in the paper evals. Please send an email to kkellenbe@hotmail.com if you are interested in helping out.

One other important note. It is once again time to vote for the Board of DIrectors. I have heard that there are three slots open and that six candidates will be running. It is going to be a tough choice, and I am already trying to figure out if I can vote twice so that I can vote for all six. All kidding aside, I am endorsing Lynda Rab, Tom "SQLBatman" LaRock and Andy Warren.