Calendar Tables

  • I have been doing this for few years now and was wondering if I was the only person doing it

    It is good to see that many have come to the same logical reasoning

    I also import and set the Public holidays of every year in to the system which becomes usefull for payroll type of applications

  • Worth mentioning that this technique works well on any data subject area where there is a defined set of answers.

    For example, permutation/combination figures, significance tables.

    I was quite proud of coming up with efficient formulae for handling nPr, nCr stuff in http://www.sqlservercentral.com/articles/Troubleshooting/75990/ but suffered the DOH! moment when someone pointed out I could actually store the values in a table instead!

  • Excellent article! Thank you.

    Now I have a question for you more experienced guys and gals. 😀

    If I use a tally table like the one below would that be alright? I'm

    trying to use a single tally table for numbers and dates as well.

    IF OBJECT_ID('dbo.Tally', 'U') IS NOT NULL

    DROP TABLE dbo.Tally;

    GO

    DECLARE @StartDate DATETIME = '18501231';

    SELECT Tab.Number,

    @StartDate + Tab.Number AS Date,

    MONTH(@StartDate + Tab.Number) AS Month,

    YEAR(@StartDate + Tab.Number) AS Year

    INTO dbo.Tally

    FROM (SELECT TOP(100000)

    ROW_NUMBER() OVER (ORDER BY C1.column_id)

    FROM model.sys.columns AS C1

    CROSS JOIN model.sys.columns AS C2

    CROSS JOIN model.sys.columns AS C3) AS Tab(Number);

    ALTER TABLE dbo.Tally

    ALTER COLUMN Number INT NOT NULL;

    GO

    ALTER TABLE dbo.Tally

    ADD CONSTRAINT PK_Tally PRIMARY KEY CLUSTERED(Number)

    WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF,

    FILLFACTOR = 100);

    GO

    CREATE NONCLUSTERED INDEX IX_Tally_Date ON dbo.Tally(Date)

    WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF,

    FILLFACTOR = 100);

    GO

    CREATE NONCLUSTERED INDEX IX_Tally_Year ON dbo.Tally(Year)

    WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF,

    FILLFACTOR = 100);

    GO

    EXEC sp_spaceused 'dbo.Tally';

    -- Tally100000 8512 KB3328 KB4432 KB752 KB

    Or should I just remove the Number column and calculate the ROW_NUMBER everytime I use the table? I'm trying to think performance-wise. Have a great weekend y'all!

    Best regards,

    Andre Guerreiro Neto

    Database Analyst
    http://www.softplan.com.br
    MCITPx1/MCTSx2/MCSE/MCSA

  • Yes, I commonly use the yearmonth (YYYYMM) scenario. However, I use the fieldname TimeKey so as not to argue with the system about any reserved words. Additionally, I use an integer field in the table to allow queries easily to span segments that include differing years. The integer opens the door to arithmetic operations for query construction.

    Example:

    SequenceNumber TimeKey

    11 201111

    12 201112

    13 201201

    14 201212 & C.

    In addition, I use the fields StartKey and StopKey (with a default 999999 on the second) as a handy way of identifying SCD lookups for historical data. This system is very useful when working with accounting systems where discreet monthly data is used.

  • codebyo (2/25/2012)


    Excellent article! Thank you.

    Now I have a question for you more experienced guys and gals. 😀

    If I use a tally table like the one below would that be alright? I'm

    trying to use a single tally table for numbers and dates as well.

    IF OBJECT_ID('dbo.Tally', 'U') IS NOT NULL

    DROP TABLE dbo.Tally;

    GO

    DECLARE @StartDate DATETIME = '18501231';

    SELECT Tab.Number,

    @StartDate + Tab.Number AS Date,

    MONTH(@StartDate + Tab.Number) AS Month,

    YEAR(@StartDate + Tab.Number) AS Year

    INTO dbo.Tally

    FROM (SELECT TOP(100000)

    ROW_NUMBER() OVER (ORDER BY C1.column_id)

    FROM model.sys.columns AS C1

    CROSS JOIN model.sys.columns AS C2

    CROSS JOIN model.sys.columns AS C3) AS Tab(Number);

    ALTER TABLE dbo.Tally

    ALTER COLUMN Number INT NOT NULL;

    GO

    ALTER TABLE dbo.Tally

    ADD CONSTRAINT PK_Tally PRIMARY KEY CLUSTERED(Number)

    WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF,

    FILLFACTOR = 100);

    GO

    CREATE NONCLUSTERED INDEX IX_Tally_Date ON dbo.Tally(Date)

    WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF,

    FILLFACTOR = 100);

    GO

    CREATE NONCLUSTERED INDEX IX_Tally_Year ON dbo.Tally(Year)

    WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF,

    FILLFACTOR = 100);

    GO

    EXEC sp_spaceused 'dbo.Tally';

    -- Tally100000 8512 KB3328 KB4432 KB752 KB

    Or should I just remove the Number column and calculate the ROW_NUMBER everytime I use the table? I'm trying to think performance-wise. Have a great weekend y'all!

    Part of the "speed" factor of a Tally table is that it's densely packed with only the information it needs. Adding the columns you have will make it necessary to read at least twice as many pages to do any job that uses it. Further, you've also just increased the "in memory" size of the table when it does cache.

    My recommendation is to never mix the Tally Table with a Calendar table of any type. It's not good for the Tally side of it all.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (2/27/2012)


    Part of the "speed" factor of a Tally table is that it's densely packed with only the information it needs. Adding the columns you have will make it necessary to read at least twice as many pages to do any job that uses it. Further, you've also just increased the "in memory" size of the table when it does cache.

    My recommendation is to never mix the Tally Table with a Calendar table of any type. It's not good for the Tally side of it all.

    I see what you mean. My tally/calendar table would behave just like an ordinary table being joined, right? No real gain there. Lesson learned. Thank you again, Jeff.

    Best regards,

    Andre Guerreiro Neto

    Database Analyst
    http://www.softplan.com.br
    MCITPx1/MCTSx2/MCSE/MCSA

  • Great article and technique.

    How about dates stored in UTC? How do you configure the date table then? Do you create a date table that is also in UTC? Does this even make a difference? UTC would be the same as the dates then, right?

  • michaelm-746562 (2/27/2012)


    Additionally, I use an integer field in the table to allow queries easily to span segments that include differing years. The integer opens the door to arithmetic operations for query construction.

    Example:

    SequenceNumber TimeKey

    11 201111

    12 201112

    13 201201

    14 201212 & C.

    Sorry for the late response but I lost track of this thread. A recent post to it "found" it for me.

    What kind of "arithmetic operations" have you actually used against that column and what do they do?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Sorry for the late response... I'm just getting caught up on some newsletters where I've been too busy to read.

    I think that you did a great job in this article, except in one place (and I'm somewhat surprised that Jeff didn't mention this).

    So, in all of this work to build a calendar table to create faster, more efficient set-based queries, you end up using a loop instead of a set-based method to generate your table... here's a set-based method to generate the calendar table:

    WITH Tens (N) AS (SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL

    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL

    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1),

    Thousands (N) AS (SELECT t1.N FROM Tens t1, Tens t2, Tens t3),

    Millions (N) AS (SELECT t1.N FROM Thousands t1, Thousands t2),

    Tally (N) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM Millions),

    Months (N, MonthStart) AS (SELECT TOP (25*12)

    N, DATEADD(MONTH, N, '1999-12-01T00:00:00')

    FROM Tally)

    SELECT N,

    MonthStart,

    NextMonth = DATEADD(MONTH, 1, MonthStart),

    MonthDescr = CONVERT(CHAR(3), caMonth.MonthName) + '-' +

    RIGHT(CONVERT(CHAR(4), YEAR(MonthStart)),2),

    caMonth.MonthName,

    YearMonth = (YEAR(MonthStart)*100) + caMonth.MonthNbr,

    caMonth.MonthNbr

    FROM Months

    CROSS APPLY (SELECT MonthNbr = DATEPART(MONTH, Months.MonthStart),

    MonthName = DATENAME(MONTH, Months.MonthStart)) caMonth

    ORDER BY Months.MonthStart;

    You can use SET IDENTITY_INSERT to insert N into the Identity column. And to address the Y2K issue brought up (running out of dates), just change the "25" to 5000 to take it up to year 7000. Now, if queries based on this table are still in use then, I think we've got bigger problems to worry about... :-D:w00t:

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • WayneS (6/23/2012)


    I think that you did a great job in this article, except in one place (and I'm somewhat surprised that Jeff didn't mention this).

    Including the WHILE loop, there are good many things about Todd's Calendar Table code that I would have someone change to pass a production code review but didn't mention because I didn't want to take any focus away from what Todd was trying to say in his good article. The point he's trying to get across is much more important.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Is there an error in the last code section of your article?

    snippit:

    -- This returns the first day of the current month.

    SET @StartYear2 = DATEADD(YY, -1, DATEADD(MM, DATEDIFF(MM, 0, GETDATE()), 0));

    -- Get the MonthID for this month.

    SELECT

    @StartMonth = MonthID

    FROM CalMonth

    WHERE MonthStart = @StartYear2;

    -- Set to same month the year before.

    SET @StartMonth = @StartMonth - 12

    I would think that the @StartMonth should be -24 instead of -12. (2 years of data)

    It also fixes the last 5 lines of SecondYearMonth(same as FirstYearMonth) and Revenue=0.00.

    FirstYearMonth Year1Revenue SecondYearMonth Year2Revenue MonthOrd

    JUN-08 814652.29 JUN-09 815557.55 0

    JUL-08 850882.41 JUL-09 831792.32 1

    AUG-08 844237.14 AUG-09 837465.42 2

    SEP-08 836915.91 SEP-09 833296.03 3

    OCT-08 838332.34 OCT-09 855292.19 4

    NOV-08 819412.58 NOV-09 827802.45 5

    DEC-08 838095.67 DEC-09 851131.06 6

    JAN-09 858064.48 JAN-09 0.00 7

    FEB-09 771925.00 FEB-09 0.00 8

    MAR-09 855450.64 MAR-09 0.00 9

    APR-09 821391.97 APR-09 0.00 10

    MAY-09 853181.60 MAY-09 0.00 11

  • This article is an awesome start. I assume that anyone can take this knowledge and tailor it to fit their needs. I have a few questions though:

    1) What are the cons of using this table?

    2) Does this table take into account different time zones, holidays, leap year etc..

    or would that be conditions that the developer would have to add to their queries?

    3) What are some things to be aware of when querying this table?

    Thanks for a great article/lesson. Cheers!

    Mark

  • I've used calendar tables for all manufacturing sites; they always contained one row per day.

    In my opinion the very biggest reason to create a calendar table is to know the cross referencing of all the following columns for any given date:

    Julian date

    gregorian date

    accounting month,

    accounting week of month,

    serial day in accounting month,

    serial day in accounting year,

    daylight savings time indicator

    company holiday indicator

    manufacturing workday indicator,

    First, second, and third shift hours,

    staff workday indicator,

    Day of week

    Knowing all of these makes it easy to calculate the time to complete a manufacturing order and which accounting periods are affected. And it is easy to convert other types of dates depending on whom you are reporting to.

  • I've used tables containing base data for calculating TimeZones and Daylight Savings Times for any year. But, let me tell you they were a royal mess to deal with. There are so many exceptions, it all breaks down in complexity. Going forward, I will always create a table which contains everything resolved. It will be much faster, just like Intel burns precalculated cosine tables into CPUs for simplicity and speed.

  • I've done something similar for a couple of databases. Because the reporting systems are billing and HR related, the table also contains pre-calculated columns indicating weekday_abbr (Mon - Fri), is_business_day (0 or 1), is_month_business_day_1 (0 or 1), is_holiday (0 or 1), etc. This is one of those table design techniques common in the dimensional modeling realm that can be leveraged in the OLTP and DW realm.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

Viewing 15 posts - 31 through 45 (of 46 total)

You must be logged in to reply to this topic. Login to reply