Click here to monitor SSC
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 

Returning 0's in a SUM (SQL Spackle)

By Patrick Cahill,

Introduction

A common request that many of us have had is our manager comes and wants a simple report on total sales for a day summarized by each hour. If you do a simple SUM and GROUP BY hour, you will only show the hours that have data. But your manager wants to see a 0 for the hours with no data. Instead of seeing this:

He wants to see this:

Creating the Test Data

In order to show how to do this I will need some test data. I will create a #SalesTestData table and populate it with 1,000,000 records with dates between Jan 1, 2007 and Dec 31, 2011. Then I will delete all the records that have a time in the 8 or 17 hours. I am also deleting data from 3 days of each month. This is being done so when we run the demo code, there will be some hours that have no data. This way I can demonstrate showing a 0 when there is no data.

The code that I use to populate the test data was taken from scripts written by Jeff Moden.


IF (SELECT OBJECT_ID('tempdb..#SalesTestData')
) IS NOT NULL
DROP TABLE #SalesTestData
/*Create the table and populate it with random dates from Jan 1, 2007 to Dec 31, 2011*/
SELECT TOP 1000000
CAST(RAND(CHECKSUM(NEWID())) * 1826.0 + 39081.0 AS DATETIME) AS SaleTime
,ABS(CHECKSUM(NEWID())) % 50000 + 1 AS SaleQty
INTO dbo.#SalesTestData
FROM master.sys.all_columns AS ac
CROSS JOIN master.sys.all_columns AS ac1
/*Creating a clustered index for performance*/
CREATE CLUSTERED INDEX IXC_#SalesTestData_SaleTime
ON #SalesTestData (SaleTime)
/*Delete all data where the hour is 8 or 17 and where the day is the 9th, 16th or 23rd. This is to show that 0's will be returned for the missing data.*/
DELETE #SalesTestData
WHERE DATEPART(HOUR,SaleTime) IN (8,17)
DELETE #SalesTestData
WHERE DATEPART(day,SaleTime) IN (9,16,23) GO

Summing the Data by Hour

Now that we have the test data I will sum the data for one day by hour using a table variable. A table variable is like a temp table, but you have to declare it first and it will only exist during the current session. The reason for the table variable is so I can create a basic table with all the hours of the day. This table will just be populated for one day with a start hour and a next hour. I will then left join the #SalesTestDate table to this table. I use a left join so when I use the SUM and GROUP BY hour I will get a 0 for those hours without a sale.


DECLARE @Date DATETIME = '08/13/2009' ; 
/*Create a table variable and populate it with StartHour for one day*/ DECLARE @hour TABLE
(StartHour DATETIME
,NextHour DATETIME
) ;
INSERT INTO @hour
(StartHour
,NextHour
)
SELECT DATEADD(hh,Number,@Date) AS StartHour
,DATEADD(hh,Number + 1,@Date) AS NextHour
FROM Master.dbo.spt_Values
WHERE Type = 'P'
AND Number BETWEEN 0 AND 23 ;
SELECT h.StartHour
,COALESCE(SUM(SalesTestData.SaleQty),0) AS Qty
FROM @hour AS h
LEFT OUTER JOIN #SalesTestData AS SalesTestData ON SalesTestData.SaleTime >= h.StartHour
AND SalesTestData.SaleTime < NextHour
AND SalesTestData.SaleTime >= @Date
AND SalesTestData.SaleTime < DATEADD(dd,1,@Date)
GROUP BY h.StartHour GO

Summarizing by Month with a Table Variable

Now I will sum the data by day for one month. I am running for April 2010, but you can change this to run for any month that there is data for. In this example I am creating a table variable and populating it with the days of the month. I then join this table to the #SalesTestData table and do my summary. I am using a left join on the #SalesTestData table. This way I will still get a 0 record for the days that there is no data. If I did an inner join then I would not get a record returned for those days.


/*To get the Sales count by day without a calendar table*/
DECLARE @DataMonth INT
,@StartDate DATETIME
,@NextDate DATETIME
,@LastDay INT ; /*Calculating the start date based on the month */
SET @DataMonth = 201004 ;
SET @StartDate = CONVERT(DATETIME,LEFT(@DataMonth,4) + '-' + RIGHT(@DataMonth,2) + '-01') ;
SET @NextDate = DATEADD(MONTH,1,@StartDate) ;
SET @LastDay = DATEPART(d,DATEADD(d,-1,@NextDate)) -1 ;
DECLARE @date TABLE
(StartDate DATETIME
,NextDate DATETIME
) ;
INSERT INTO @date
(StartDate
,NextDate
)
SELECT DATEADD(d,Number,@StartDate) AS StartDate
,DATEADD(d,Number + 1,@StartDate) AS NextDate
FROM Master.dbo.spt_Values
WHERE Type = 'P'
AND Number BETWEEN 0 AND @LastDay ;
SELECT h.StartDate AS [Date]
,COALESCE(SUM(SalesTestData.SaleQty),0) AS Qty
FROM @date AS h
LEFT OUTER JOIN #SalesTestData AS SalesTestData ON SalesTestData.SaleTime >= h.StartDate AND SalesTestData.SaleTime < h.NextDate
and SalesTestData.SaleTime >= @StartDate AND SalesTestData.SaleTime < @NextDate
GROUP BY h.StartDate ; GO

Creating a Calendar Table

Next let's create a calendar table and populate it. I'm populating the calendar table for the years 2007 to 2011, this way it will match the dates in the #SaleTestData table. It is a simple calendar table with a start date and next date for each day, and a MonthId. If I was going to create calendar table for my production system it would probably be a little more complicated, but for my demonstration this is all I need. I will then use this table to do a sum with it instead of the table variable.


DECLARE @StartDate DATETIME ;
/*Date to start populating the calendar table. Populates for 5 years*/ SELECT @StartDate = '01/01/2007' ;
IF (SELECT OBJECT_ID('tempdb..#CalendarTest')
) IS NOT NULL
DROP TABLE #CalendarTest
SELECT CAST (CONVERT(CHAR(8),DATEADD(d,v.Number,@StartDate),112) AS INT) AS CalendarId
,DATEADD(d,v.Number,@StartDate) AS StartDate
,DATEADD(d,v.Number,@StartDate)+1 AS NextDate
,DATEPART(dd,DATEADD(d,v.Number,@StartDate)) AS DateOfMonth
,CONVERT(INT,REPLACE(CONVERT(CHAR(7),DATEADD(d,v.Number,@StartDate),102),'.','')) AS MonthId
INTO #CalendarTest
FROM Master.dbo.spt_Values v
WHERE Type = 'P'
AND v.Number BETWEEN 0 AND 1825 ;
--Creating a clustered index for performance
CREATE CLUSTERED INDEX IXC_#CalendarTest_CalendarId
ON #CalendarTest (CalendarId) GO

Summarizing by Month with a Calendar Table

In this example I am using a calendar table instead of the table variable. Again I am using a left join on the calendar table. This code is a little cleaner than using the table variable plus if you have a calendar table it can be reused in other code, while the table variable is only good for the script that it was written in.

/*To get the Sales count by day with a Calendar table*/
DECLARE @DataMonth INT
,@StartDate DATETIME
,@NextDate DATETIME ;
SET @DataMonth = 201004 ;
SELECT @StartDate = MIN(StartDate)
,@NextDate = MAX(NextDate)
FROM #CalendarTest
WHERE MonthId = @DataMonth ;
SELECT CalendarTest.StartDate AS [Date]
,COALESCE(SUM(SalesTestData.SaleQty),0) AS Qty
FROM #CalendarTest AS CalendarTest
LEFT OUTER JOIN #SalesTestData AS SalesTestData ON DATEPART(DAY,SalesTestData.SaleTime) = CalendarTest.DateOfMonth
AND SalesTestData.SaleTime >= @StartDate AND SalesTestData.SaleTime < @NextDate
WHERE CalendarTest.MonthId = @DataMonth
GROUP BY CalendarTest.StartDate
ORDER BY CalendarTest.StartDate ; GO

I know there are other ways to accomplish this. This is just one example of how to do it.

 
Total article views: 7786 | Views in the last 30 days: 4
 
Related Articles
ARTICLE

Bones of SQL - Practical Calendar Queries

After the first article created a Calendar table, learn how to use this in a practical sense with so...

FORUM

Temporary Table

Calendar

ARTICLE

Holidays and Calendar Generation

Create the definition of your holidays and have them available for any year. Create a Calendar incl...

FORUM

dateAdd inside where clause

dateAdd function syntax help

FORUM

Parameter Startdate, Enddate validation

SSRS Parmeters : startdate should not be later than enddate

Tags
sql spackle    
t-sql    
 
Contribute