Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase «««1234»»

Generating SubTotals using GROUPING Expand / Collapse
Author
Message
Posted Monday, July 19, 2010 6:34 AM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Yesterday @ 11:30 PM
Points: 36,706, Visits: 31,156
Hugo,

The gain is that there is nothing you can naturally order by that will give you the correct original order. And stop playing the "gotcha back" tricks... can you make the natural sort fail or not?


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #954772
Posted Monday, July 19, 2010 2:40 PM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Today @ 3:21 AM
Points: 5,916, Visits: 8,164
Jeff Moden (7/19/2010)
Hugo,

The gain is that there is nothing you can naturally order by that will give you the correct original order.

Huh?
USE AdventureWorks2008;
SELECT TerritoryID, GROUPING(TerritoryID),
SalesPersonID, GROUPING(SalesPersonID),
SUM(TotalDue) AS SumOfTotalDue
FROM Sales.SalesOrderHeader
GROUP BY ROLLUP(TerritoryID, SalesPersonID)
ORDER BY GROUPING(TerritoryID), TerritoryID,
GROUPING(SalesPersonID), SalesPersonID;

And stop playing the "gotcha back" tricks... can you make the natural sort fail or not?

Whether or not I can is completely irrelevant. I advise people to include the ORDER BY, because it guarantees correct results. You advise people to omit the ORDER BY clause; in my opinion that places the burden on you to prove that you will never need it. Not even after installing a service pack or upgrading to the next version of SQL Server. Can you guarantee that?
Heck, can you even guarantee that this will always work on the current version of SQL Server? I'm on holiday, with only a simple laptop computer, so my testing capacity is limited. But imagine a SQL Server instance on a computer with 16 cores, with Sales.SalesOrderHeader partitioned by TerritoryID and spread over 16 seperate spindles, and with the amount of data in that table bumped to several billion rows. I would expect (and, in fact, even HOPE) that each core gets to process the data for a single partition. Without the ORDER BY, the data will be returned as soon as it's ready. Why would SQL Server wait for core #1 to finish if core #5 already has some data ready to be returned?

Again - if you advise people to rely on undocumented behaviour, the burden is on you to prove that it will work in all cases. And that it will continue to work in the future. If you can't, and you still keep giving the same advise, you are repeating the mistake many people made in the days of SQL Server 6.5 when they told people to omit the ORDER BY after a GROUP BY.



Hugo Kornelis, SQL Server MVP
Visit my SQL Server blog: http://sqlblog.com/blogs/hugo_kornelis
Post #955119
Posted Monday, July 19, 2010 3:31 PM
SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Monday, June 30, 2014 6:34 AM
Points: 411, Visits: 1,398
Guys, this topic was about the article on the usage of grouping() but you've moved away to wether or not to add a sort clause now. I feel a little guilty for that because I first made a note on the sorting. You've both made your points and since neither can give proof there's no need to keep poluting the topic any more. I personally totally see Hugo's point, so I'll keep on adding the order by clause. Even though I totally respect -if not live by- Jeff's advices normally. Since the sorting is undocumented behavior (unless someone else has a link?) I think it's up to each and everyone to decide for themselves whether or not their solution should rely on it and I don't see any reason not to add the order by clause. If you want to continue this debate, can you please start a separate topic for it?



Posting Data Etiquette - Jeff Moden
Posting Performance Based Questions - Gail Shaw
Hidden RBAR - Jeff Moden
Cross Tabs and Pivots - Jeff Moden
Catch-all queries - Gail Shaw


If you don't have time to do it right, when will you have time to do it over?
Post #955140
Posted Monday, July 19, 2010 3:44 PM


SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Sunday, April 27, 2014 6:26 AM
Points: 1,521, Visits: 3,036
Gosh, I hope I don't get my shoes wet by stepping near this, but.....

The original script in the article absolutely needs an ORDER BY to properly interleave the two halves of the UNION ALL statement (the detail rows and the aggregated rows).

Also, BOL is quite clear in saying that GROUP BY results are not guaranteed to be in any particular order and explicitly prescribes an ORDER BY clause if the order is important. With no direction from MS either way on queries using WITH ROLLUP or ROLLUP(), I'd have to agree that this advice would best be followed in those cases as well.

-----
edit:
Sorry, R.P. I saw your post only after hitting send on mine.
Post #955150
Posted Monday, July 19, 2010 3:53 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Yesterday @ 11:30 PM
Points: 36,706, Visits: 31,156
Hugo Kornelis (7/19/2010)
Jeff Moden (7/19/2010)
Hugo,

The gain is that there is nothing you can naturally order by that will give you the correct original order.

Huh?
USE AdventureWorks2008;
SELECT TerritoryID, GROUPING(TerritoryID),
SalesPersonID, GROUPING(SalesPersonID),
SUM(TotalDue) AS SumOfTotalDue
FROM Sales.SalesOrderHeader
GROUP BY ROLLUP(TerritoryID, SalesPersonID)
ORDER BY GROUPING(TerritoryID), TerritoryID,
GROUPING(SalesPersonID), SalesPersonID;

And stop playing the "gotcha back" tricks... can you make the natural sort fail or not?

Whether or not I can is completely irrelevant. I advise people to include the ORDER BY, because it guarantees correct results. You advise people to omit the ORDER BY clause; in my opinion that places the burden on you to prove that you will never need it. Not even after installing a service pack or upgrading to the next version of SQL Server. Can you guarantee that?
Heck, can you even guarantee that this will always work on the current version of SQL Server? I'm on holiday, with only a simple laptop computer, so my testing capacity is limited. But imagine a SQL Server instance on a computer with 16 cores, with Sales.SalesOrderHeader partitioned by TerritoryID and spread over 16 seperate spindles, and with the amount of data in that table bumped to several billion rows. I would expect (and, in fact, even HOPE) that each core gets to process the data for a single partition. Without the ORDER BY, the data will be returned as soon as it's ready. Why would SQL Server wait for core #1 to finish if core #5 already has some data ready to be returned?

Again - if you advise people to rely on undocumented behaviour, the burden is on you to prove that it will work in all cases. And that it will continue to work in the future. If you can't, and you still keep giving the same advise, you are repeating the mistake many people made in the days of SQL Server 6.5 when they told people to omit the ORDER BY after a GROUP BY.


I humbly stand corrected. That sort works. Well done, Hugo. Still, the order of run return is, in fact, documented in Books Online...

ROLLUP
Specifies that in addition to the usual rows provided by GROUP BY, summary rows are introduced into the result set. Groups are summarized in a hierarchical order, from the lowest level in the group to the highest. The group hierarchy is determined by the order in which the grouping columns are specified. Changing the order of the grouping columns can affect the number of rows produced in the result set.



--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #955154
Posted Monday, July 19, 2010 4:00 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Yesterday @ 11:30 PM
Points: 36,706, Visits: 31,156
R.P.Rozema (7/19/2010)
Guys, this topic was about the article on the usage of grouping() but you've moved away to wether or not to add a sort clause now. I feel a little guilty for that because I first made a note on the sorting. You've both made your points and since neither can give proof there's no need to keep poluting the topic any more. I personally totally see Hugo's point, so I'll keep on adding the order by clause. Even though I totally respect -if not live by- Jeff's advices normally. Since the sorting is undocumented behavior (unless someone else has a link?) I think it's up to each and everyone to decide for themselves whether or not their solution should rely on it and I don't see any reason not to add the order by clause. If you want to continue this debate, can you please start a separate topic for it?


The sorting is not undocumented... and that was my point. And, I am truly humbled by your comment. Thank you.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #955164
Posted Tuesday, July 20, 2010 4:16 AM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Today @ 3:21 AM
Points: 5,916, Visits: 8,164
My final comment in this topic, and then I'll respect R.P.'s request.

Jeff Moden (7/19/2010)
Still, the order of run return is, in fact, documented in Books Online...

ROLLUP
Specifies that in addition to the usual rows provided by GROUP BY, summary rows are introduced into the result set. Groups are summarized in a hierarchical order, from the lowest level in the group to the highest. The group hierarchy is determined by the order in which the grouping columns are specified. Changing the order of the grouping columns can affect the number of rows produced in the result set.


This quote describes which possible summary rows are and are not introduced in the result set, not the order in which results are returned.

(I'm also tempted to point out the use of the term "result set" rather than "recordset" in this quote, but the terminology in BOL as a whole is so often wrong that I'd probably better not go there)

Jeff, or anyone else - if you want to continue this debate, then please start a new topic and send me a PM with the link, or include the link in this topic. Without a link, I'll probably never find the new topic.



Hugo Kornelis, SQL Server MVP
Visit my SQL Server blog: http://sqlblog.com/blogs/hugo_kornelis
Post #955410
Posted Tuesday, July 20, 2010 11:05 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Wednesday, July 21, 2010 9:37 AM
Points: 1, Visits: 6
The result set I get is not the same as the result set posted in the article. Can you repost the code to populate the tables? I had a DBA run it here at my company and he got the same results that I got.

First 10 rows:
'Product1', 'Scheme1', 1524.15, 124.32, '2010-07-20 16:39:32'
'Product1', 'Scheme2', 1524.15, 124.32, '2010-07-20 16:39:32'
'Product1', 'Scheme3', 1524.15, 124.32, '2010-07-20 16:39:32'
'Product1', 'Scheme4', 1524.15, 124.32, '2010-07-20 16:39:32'
'Product1', 'Scheme5', 1524.15, 124.32, '2010-07-20 16:39:32'
'Product1', 'Scheme1', 9144.90, 745.92, '2010-07-20 16:39:32'
'Product1', 'Scheme2', 4572.45, 372.96, '2010-07-20 16:39:32'
'Product1', 'Scheme3', 3048.30, 248.64, '2010-07-20 16:39:32'
'Product1', 'Scheme4', 3048.30, 248.64, '2010-07-20 16:39:32'
'Product1', 'Scheme5', 3048.30, 248.64, '2010-07-20 16:39:32'
Post #955766
Posted Tuesday, July 20, 2010 11:35 AM


SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Sunday, April 27, 2014 6:26 AM
Points: 1,521, Visits: 3,036
witasj,

I see the same contents in @Grouptest as you do.

It looks as though the loaded table shown in the article may not be from the final version of the code to build it, but the results with aggregated subtotals do seem to match up to what's shown.
Post #955788
Posted Tuesday, July 20, 2010 12:05 PM


SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Sunday, April 27, 2014 6:26 AM
Points: 1,521, Visits: 3,036
Here one question comes in the mind that why only one row is indicated by 1? The Scheme_Total and Product_Total is also aggregated but it is not indicated by 1. I have not yet received any satisfactory answer for this. I will post the answer here once I get it.


I tried to answer this question from the article last week, and realized today that I was wrong in my explanation. I've corrected that post above.
Post #955811
« Prev Topic | Next Topic »

Add to briefcase «««1234»»

Permissions Expand / Collapse