SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 

SQL #56–Aggregation with ROLLUP operator is not fully supported

ROLLUP, CUBE, and GROUPING SETS Operators have been around since SQL Server 2005. These operators are extensions of the GROUP BY clause.

The GROUPING SETS operator might be introduced later in SQL Server 2008.

I only have experience with the ROLLUP operator. My test of GROUPING SET also suggested that it is not fully supported either.

The ROLLUP operator is useful in generating reports that contain subtotals and totals

The ROLLUP operator is useful in generating reports that contain subtotals and totals in the same dataset.

This simple GROUP BY query with a ROLLUP operator generates a grand total rows at the end.

pic1

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Dynamic SQL and aggregates with the ROLLUP are commonly used in web reporting project

You might be wondering where you can actually use this feature. I never used it until I was involved in a web reporting project. To centralize all the data manipulation logic in the database layer, the decision was to write a very flexible stored procedure that meets the following three requirements:

  • All metrics calculations must be table-driven.
  • Only one stored procedure should be used.
  • The stored procedure should return subtotals and grand totals in the same result set.

The first two requirements forced us to generate dynamic SQLs instead of static SQL queries. The last requirement forced us to resort to the ROLLUP operator with aggregations.

This all worked well until we had a new requirement for an additional metric, which needs to calculate:

  • A percentage of a COUNT for group over a subtotal.

SUM(COUNT(*)) OVER() can generate an aggregate count

My first reaction is to simply add a TOTAL_CNT to the query using SUM(COUNT(*)) OVER(). Then the PCT_OF_TOTAL will be just the CNT over the TOTAL_CNT.

My thought was right on the mark, until I noticed that ROLLUP operator altered two calculations, TOTAL_CNT and PCT_OF_TOTAL.

To try to figure out what exactly went wrong, I tested a simple aggregation query.

Here I added a few things to the first query.

  • VacationHours and AVG_VacationHours: I wanted to make sure that a simple averaging would work always.
  • TOTAL_CNT: I used a SUM(COUNT(*)) OVER() to get the total count over all the DepartmentName.
  • PCT_OF_TOTAL: this is just the CNT over the TOTAL_CNT.

This query has no ROLLUP or GROUPING SETS. All numbers look good including the TOTAL_CNT and PCT_OF_TOTAL

 pic2

 

 

 

 

 

 

 

 

 

 

 

 

 

 

GROUPING SETS and ROLLUP messed up the SUM(COUNT(*)) OVER()

Next, I added the GROUPING SETS operator.

with the GROUPING SETS operator, a grand total row was added to the result set. The count and the average are all correct, but the TOTAL_CNT and the PCT_OF_TOTAL has been messed up.

pic3

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Looking closely, you can see that the TOTAL_CNT was doubled. You might be tempted to try to correct it by dividing by 2 to get it right. But adding more hierarchies to the aggregation generates even more confusing results.

ROLLUP operator does not support the SUM() OVER()

I was quite disappointed that the ROLLUP operator does not support the SUM() OVER(). This basically means that I’d need to write a lot more inefficient code to achieve this. With the dynamic queries in the midst, the coding will not be as efficient and elegant.

The closest example I could find is from this SQL Magazine article from 2012 by D. Yitzhak, in which only examples of ranking functions with the ROLLUP, CUBE, and GROUPING SETS operators were given.

Take Advantage of SQL Server 2012’s Window Functionality

GROUPING SETS is supposed to supersede ROLLUP

In the last query, I used the GROUPING SETS instead of the ROLLUP operator. This is because GROUPING SETS is a newer additions to SQL Server, and is supposed to supersede ROLLUP. According to many blogs, the WITH ROLLUP and WITH CUBE features will be removed in future versions of SQL Server. But I have not seen any official statement from Microsoft.

For more on GROUPING SETS in SQL Server 2008 R2, check out these TechNet article:

OVER Clause

OVER Clause has been around in SQL Server for a long time. If you have used any of the following types of functions, you are already familiar with it:

  • Ranking functions
  • Aggregate functions
  • Analytic functions
  • NEXT VALUE FOR function

The OVER clause defines a window within a query result set. A window function then computes a value for each row in the window. You can use the OVER clause with functions to compute aggregated values such as

  • moving averages
  • cumulative aggregates
  • running totals
  • top N per group results

In the following example, adding an ORDER BY clause to the SUM(COUNT(*)) OVER() generates running totals.

pic4

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

For more information, check out this MSDN SQL Server 2014 article:

OVER Clause (Transact-SQL)

 


Sherry Li's BI Corner

Always wanting to publish novels, but having the fortune to work in the data warehouse corner of technology, Sherry Li started to write the mysteries of the Microsoft Business Intelligence. She writes everything from T-SQL to MDX, ETL to Expressions to Scripting, Reporting to Cubes. You can find her writings at bisherryli.wordpress.com.

Comments

Leave a comment on the original post [bisherryli.com, opens in a new window]

Loading comments...