Normalizing Dates

,

Normalizing Dates

Typically, in past projects, I have used datetime columns to correlate some date/time value with a transaction or entity's creation and modification.

For instance, if a customer where to place an order, I would insert a record into an Order table with the order number and a date/time for when the order was generated.

In another example, if I needed to dynamically schedule sending emails to customers on given dates, I would insert a date/time along with a customer id

into an EmailCustomer table to identify when to send the emails.

In a current project, I decided to experiment with how I worked with dates a bit. I had the idea: 'What if I normalized my dates?' Essentially, using some ideas from OLAP and

implementing them in an OLTP environment. What was I thinking?! Why would I do something like that?! Read on...

This project, I'll call 'project X' for the sake of this article is a heavy data collection and reporting application. Initially, I was looking at OLAP and Data Warehousing as an option for the reporting. In the end, for many non-technical reasons, OLAP was not the direction we took, but it did provide an introductory glance at cubes, dimensions and fact tables. As an OLAP newbie, I got started by stepping through the Analysis Services tutorial. (As an aside, the Analysis Services tutorial from a beginners standpoint is quite good. The only area I found missing was the methodology and best practices for creating fact tables.)

In an effort to get up to speed on the world of OLAP, I wrote a stored procedure to build a Time dimension for a cube. My time dimension would include, the date, hour of day, day of year, month of year, year, fiscal year, quarter, and several additional date/time groupings. After testing, I created a SQL job to automated the stored procedure to run daily and append the new day's date/time information to my dimension.

Once I learned that OLAP was not going to be an option, I put down the Analysis Services tutorial and continued in another direction. But soon after, I realized that I could leverage my Time dimension within my normalized, OLTP database. I added a few additional date/time attributes to the Time dimension and modified the stored procedure to pick up were it left off, even if it failed to run for some amount of time. In addition, I added the ability for the dates to forecast a year into the future.

Here's a look at the stored procedure:

CREATE PROCEDURE dbo.UpdateTimeTable

AS

Declare @LastDate smalldatetime

Declare @StartDate smalldatetime

Declare @CurrentDate smalldatetime

Declare @Day varchar(20)

Declare @Month varchar(20)

Declare @Year char(4)

Declare @MinuteOfHour tinyint

Declare @HourOfDay tinyint

Declare @DayOfMonth tinyint

Declare @DayOfYear smallint

Declare @MonthOfYear tinyint

Declare @WeekOfYear tinyint

Declare @Quarter tinyint

Declare @FiscalYear varchar(20)

set nocount on

--see if there are any records in the table

if exists(select top 1 thedate from timetable with (nolock) order by thedate desc)

    begin

       --get last date from table

       select top 1 @LastDate = thedate from timetable with (nolock) order by thedate desc

       --set start date

       set @StartDate = dateadd(dd, 1, @LastDate)

    end

else

    begin

       --set default start date

       set @StartDate = cast('1/1/1985' as smalldatetime)

    end

--set current date equal to the start date

set @CurrentDate = @StartDate

--while the current date is less than or equal to today's date + 1 year

while (datediff(dd,@CurrentDate, dateadd(yy, 1, getdate())) >= 0)

    begin

       --set minute default

       set @MinuteOfHour = 15

       --set hour default

       set @HourOfDay = 1

       --get day

       set @Day = datename(weekday,@CurrentDate)

       --get month

       set @Month = datename(month,@CurrentDate)

       --get year

       set @Year = datepart(yyyy,@CurrentDate)

       --get day of month

       set @DayOfMonth = datepart(dd, @CurrentDate)

       --get day of year

       set @DayOfYear = datepart(dy,@CurrentDate)

       --get week of year

       set @WeekOfYear = datepart(wk,@CurrentDate)

       --get month of year

       set @MonthOfYear = datepart(mm,@CurrentDate)

       --get quarter

       set @Quarter = datepart(qq,@CurrentDate)

       --get fiscal year

       if @MonthOfYear > 9

          begin

             set @FiscalYear = @Year + 1

          end

       else

          begin

             set @FiscalYear = @Year

          end

       --generate hours for day

       while (@HourOfDay < 25)        begin           --generate 15 minute increments for each hour           while @MinuteOfHour < 61              begin                 --insert record into timetable                 insert into timetable                    (thedate, theday, themonth, theyear, minuteofhour, hourofday, dayofmonth, dayofyear,
                   weekofyear, monthofyear, [quarter], fiscalyear)

                values

                   (@CurrentDate, @Day, @Month, @Year, @MinuteOfHour, @HourOfDay, @DayOfMonth,

                   @DayOfYear, @WeekOfYear, @MonthOfYear, @Quarter, @FiscalYear)

                --increment by 15 minutes

                set @MinuteOfHour = @MinuteOfHour + 15

             end

          --reset minutes for next hour

          set @MinuteOfHour = 15

          --increment hour by 1

          set @HourOfDay = @HourOfDay + 1

       end

       --increment current date by 1

       set @CurrentDate = dateadd(dd,1,@CurrentDate)

    end

GO

Here's a look at the Time table:

TimeID TheDate TheDay TheMonth TheYear MinuteOfHour HourOfDay DayOfMonth DayOfYear WeekOfYear MonthOfYear Quarter FiscalYear
351109 1985-01-01 Tuesday January 1985 15 1 1 1 1 1 1 1985
351110 1985-01-01 Tuesday January 1985 30 1 1 1 1 1 1 1985
351111 1985-01-01 Tuesday January 1985 45 1 1 1 1 1 1 1985
351112 1985-01-01 Tuesday January 1985 60 1 1 1 1 1 1 1985
351113 1985-01-01 Tuesday January 1985 15 2 1 1 1 1 1 1985
351114 1985-01-01 Tuesday January 1985 30 2 1 1 1 1 1 1985
351115 1985-01-01 Tuesday January 1985 45 2 1 1 1 1 1 1985
351116 1985-01-01 Tuesday January 1985 60 2 1 1 1 1 1 1985
table continued…
TimeID TheDate TheDay TheMonth TheYear MinuteOfHour HourOfDay DayOfMonth DayOfYear WeekOfYear MonthOfYear Quarter FiscalYear
1040101 2004-08-26 Thursday August 2004 15 1 26 239 35 8 3 2004
1040102 2004-08-26 Thursday August 2004 30 1 26 239 35 8 3 2004
1040103 2004-08-26 Thursday August 2004 45 1 26 239 35 8 3 2004
1040104 2004-08-26 Thursday August 2004 60 1 26 239 35 8 3 2004
1040105 2004-08-26 Thursday August 2004 15 2 26 239 35 8 3 2004
1040106 2004-08-26 Thursday August 2004 30 2 26 239 35 8 3 2004
1040107 2004-08-26 Thursday August 2004 45 2 26 239 35 8 3 2004
1040108 2004-08-26 Thursday August 2004 60 2 26 239 35 8 3 2004

Currently, the Time table has 689,088 rows. This spans 15 minute intervals from 1/1/1985 - 8/26/2004. Daily, the table would increase by 96 rows. As you may guess, this is not ideal for performance when doing joins to determine when something has happened. To resolve this, I created several indexed views to allow me the ease of querying and additional reporting leverage. By doing so, I can see one of many snapshots in time.

Here are a couple of the views:

View vwTimeDateOnly:
TimeID TheDate
351109 1/1/1985
351205 1/2/1985
351301 1/3/1985
351397 1/4/1985
351493 1/5/1985
351589 1/6/1985
351685 1/7/1985
351781 1/8/1985
351877 1/9/1985
351973 1/10/1985
vwTimeDateOnly definition:
CREATE VIEW dbo.vwTimeDateOnly

AS

SELECT MIN(TimeID) AS TimeID, TheDate

FROM dbo.TimeTable

GROUP BY TheDate


View vwTimeNoHourNoDay:
TimeID TheMonth TheYear MonthOfYear Quarter FiscalYear
351109 January 1985 1 1 1985
354085 February 1985 2 1 1985
356773 March 1985 3 1 1985
359749 April 1985 4 2 1985
362629 May 1985 5 2 1985
365605 June 1985 6 2 1985
vwTimeNoHourNoDay definition:
CREATE VIEW dbo.vwTimeNoHourNoDay

AS

SELECT MIN(TimeID) AS TimeID, TheMonth, TheYear, MonthOfYear, Quarter, FiscalYear

FROM dbo.TimeTable

GROUP BY TheMonth, TheYear, MonthOfYear, Quarter, FiscalYear

Now that I had my Time table and views, I could easily normalize out certain actions based in time in my database. For example:

In Project X, electric meters are read on certain intervals (15 minute, monthly, etc.). If a meter was read monthly, the meter reading dates could span from 1/25/2003 - 2/20/2003. For reporting and accounting purposes, the readings for the above date range need to be applied to a single month, in this instance; it would be February, 2003. Using a month/year view, I simply join the table where meter readings reside to my Time table. This linked my meter reading to a specific month and year. It also linked my meter reading to a specific fiscal year and quarter. I used this same logic to associate monthly electric invoices with a specific month and year.

Because the Time table contained dates up to one year in the future, it also allowed me to forecast energy consumption in the same way as meter readings and invoices. It also allowed me to normalize future electric rates that were posted to us prior to their effective dates.

For reporting purposes, I could now easily query based on fiscal year, quarter or even 15 minute intervals. It allowed me large amounts of flexibility without having to write long winded stored procedures to group and aggregate data by certain date ranges, etc. My time table has done all of the work for me up front.

This project has yet to reach completion, so my perspective and scheme for attacking date/time issues within the application may change. Tell me what you think.

For articles and information on SQL’s Analysis Services and OLAP click here:

For articles and information on SQL’s date parsing functions click here:

Rate

Share

Share

Rate