RE: Calendar Tables – SQLServerCentral

SSC Guru

Points: 1003863

April 23, 2014 at 1:13 am

Heh... the Tally table took one on the chin in this article and, after 3 1/2 years (hopefully, that's long enough to not detract from a valuable article written by a decent author that's trying to help), thought I'd set things straight about that especially since the Tally Table seems to be falling out of favor on the forums especially as of late.

Like I said in a previous comment on this thead, the Tally Table isn't at fault for the abysmal performance of the original Tally Table code cited in the article. It was just crap code to begin with. I built the test table including all the indexes and ran the original Tally Table code against it. Here are the results on my slower 12 year old, single 32 bit CPU, 1GB Ram desktop at home on SQL Server 2005.

(12 row(s) affected)

Table 'Worktable'. [font="Arial Black"]Scan count 1, logical reads 887369[/font], physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table '#Stay_______________________________________________________________________________________________________________000000000007'. Scan count 1, logical reads 1390, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'Tally'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:

CPU time = 10891 ms, elapsed time = 11716 ms.

Does anyone see a problem in that? :sick:

Shifting gears for a second, you might want to know why I'm running this code on such a slow box. Heh... patience... you'll see.

Up next, I ran Todd's good code using the Calendar Table he built. His code was was a marked improvement (a gross understatement on my part, for sure) both in performance and resource usage. Here are the stats from his code.

SQL Server Execution Times:

CPU time = 0 ms, elapsed time = 0 ms.

Table 'CalMonth'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:

CPU time = 0 ms, elapsed time = 8 ms.

SQL Server Execution Times:

CPU time = 0 ms, elapsed time = 0 ms.

(12 row(s) affected)

Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table '#Stay_______________________________________________________________________________________________________________000000000007'. Scan count 24, logical reads 1518, physical reads 0, read-ahead reads 3, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'CalMonth'. Scan count 2, logical reads 4, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:

CPU time = 2203 ms, elapsed time = 2598 ms.

Now, let's take a look at some code that does use a Tally Table in a more appropriate manner. I took a liberty with the assignment of the FY starting dates just to keep things simple and because that bit of code has no bearing whatsover on performance.

----------------------------------------------------------------------------

SET STATISTICS TIME, IO ON;

DECLARE @StartMonth DATETIME;

SELECT @StartMonth = 'Jun 2008';

WITH

cteCalendar AS

( --=== Eliminates the need to maintain a calendar table by using a Tally Table.

-- The date formulas just don't need to be complicated as in the original

-- code being cited as a problem in the article.

SELECT MonthStart = DATEADD(mm,t.N-1,@StartMonth)

,MonthOrd = t.N-1

,MonthDescr = RIGHT(CONVERT(CHAR(11),DATEADD(mm,t.N-1,@StartMonth),106),8)

FROM dbo.Tally t

WHERE t.N <= 24

),

ctePreAgg AS

( --=== Preaggregate the data by Month serial number.

-- Preaggregation helps exceed the need for speed. ;-)

SELECT MonthStart = DATEADD(mm,DATEDIFF(mm,0,ArrivalDate),0)

,MonthRevenue = SUM(TotalRevenue)

FROM #Stay

WHERE ArrivalDate >= @StartMonth AND ArrivalDate < DATEADD(yy,2,@StartMonth)

GROUP BY DATEDIFF(mm,0,ArrivalDate)

)

--===== This simply pivots the data using a Classic Crosstab and displays it correctly.

SELECT Year1 = MAX(CASE WHEN ca.MonthOrd/12 = 0 THEN ca.MonthDescr ELSE '' END)

,Year1Rev = SUM(CASE WHEN ca.MonthOrd/12 = 0 THEN pa.MonthRevenue ELSE 0 END)

,Year2 = MAX(CASE WHEN ca.MonthOrd/12 = 1 THEN ca.MonthDescr ELSE '' END)

,Year2Rev = SUM(CASE WHEN ca.MonthOrd/12 = 1 THEN pa.MonthRevenue ELSE 0 END)

FROM ctePreAgg pa

RIGHT JOIN cteCalendar ca ON pa.MonthStart = ca.MonthStart

GROUP BY ca.MonthOrd%12

ORDER BY ca.MonthOrd%12

SET STATISTICS TIME, IO OFF;

;

Here are the stats on that. Note that the ISNULLs are there to prevent the "null agregate" warnings that pop up because of the outer join when associated with a joined CROSSTAB.

(12 row(s) affected)

Table 'Worktable'. Scan count 1, logical reads 85, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'Tally'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:

CPU time = 453 ms, elapsed time = 556 ms.

That's 3.67 times (367% if you want to impress the boss) faster than the Calendar Table code and even uses a few less reads not to mention it only has to hit the two tables once each instead of twice each. To be fair, though, you can get the same kind of speed out of the Calendar Table, as well.

Like I said, the Tally Table wasn't the problem and the reason why I used my slower machine was to show that you don't need fancy hardware to get performance out of code, either. Consider that my Tally Table code is more than 20 times faster (2,000%) than the original Tally Table code cited in the article. There are machines (appliances, really) that you can buy that will give you 31 times improvement on such crap code but just imagine how expensive they would be and how the licensing for the 128 processors and the special version of SQL Server is going to cost you (and that's only if it actually helps). 😉

For those interested, the Tally Table code I wrote above follows a simple principle that all Developers should be aware of... keep the data layer separate from the presentation layer. Creating the 24 months of calendar data from the Tally Table and the preaggregation of the sums make up the data layer. Doing the final GROUP BY to power the CROSSTAB to get the output into the final format and apply the formatted dates is patently a display layer function. Keeping the two separate makes the code simple to read (we don't need the same function all over hell's half acre) and we don't end up with the accidental cross join that caused nearly 900,000 reads instead of less than 1,500 reads.

You'll also notice that I didn't use any really complicated DATETIME formulas (although it does require an understanding of "0" based ordinals and modulus... something that every SQL Developer should become good at) nor did I use the Calendar CTE (or Calendar Table if you go that route) that I created using the Tally Table for anything except to allow the return of zero values for missing months and to provide an easy way to display the custom date format. Everything was preaggregated before that. Admittedly, it's not always that easy. Doing things like counting out to the 5th business day from today with observances for weekends and holidays or some of the things that Bill Talada cited, are made a whole lot simpler by Calendar Tables.

Shifting gears again, on a faster machine, the differences between the code examples would be much smaller so, you might ask, why worry about it? Ask all the people that complain on these fine forums about how bogged down their servers are.

As Granny used to say, "Mind the pennies and the dollars will take care of themselves". 🙂

Todd, I'll say it again. Nice article and thank you for a great introduction to Calendar Tables.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)