Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 

Date Rounding Tactics and the Tiny Devil of SMALLDATETIME

Tiny Devils

With every new year I think a little bit about time and dates. This posts looks a little more at that in TSQL.

Rounding Dates: Which way is best?

Sometimes in TSQL you need to round a datetime value to the precision of either a day, hour, minute, or second.

I realized recently that I have a few ways I know how to do this, but I wasn’t sure which was the most efficient.

I did a little searching and didn’t find anything super conclusive. I had a little chat with Jeremiah Peschka (blog | twitter) and he told me which way he thought was fastest and why.

And so I decided to run some tests. Jeremiah has a way of being right about these things, but I had to see for myself.

I’ll go ahead and tell you: He was totally right, and I’ll show you why. But I learned a couple things along the way.

Reference: Types and Storage Sizes

To get started, let’s review some of our friends, the SQL Server datatypes. Hi friends!

Rounding to the Day

The most frequent case in which I need to round dates is to the day level. So instead of ’1/4/2011 6:15:03.393921′, I want just ’1/4/2011′.

SQL 2008′s date type made this a lot easier for everyone– now we can just cast a datetime or datetime2 value as a date, and we’ve got what we need. PLUS, our new value is nice and small, weighing in at 3 bytes.

I think most everyone agrees, we like this!

SELECT CAST('1/1/2010 23:59:59.000' AS DATE) AS [I'm a date!]

Rounding to Hours, Minutes, or Seconds:
Beware the tiny devil of SMALLDATETIME

This is still a bit more complicated. When you start thinking about these and different datatypes, you need to make sure you understand what you mean by rounding.

In SQL Server, our datatypes actually have some different opinions about what rounding means. Check this out:

SELECT
CAST('1/1/2010 23:59:59.000' AS DATETIME) AS [I'm a DATETIME!],
CAST('1/1/2010 23:59:59.000' AS DATETIME2(0))  AS [I'm a DATETIME2(0)!'],
CAST('1/1/2010 23:59:59.000' AS SMALLDATETIME) AS [I'm a SMALLDATETIME, AND I'm very confused.],
CAST('1/1/2010 23:59:59.000' AS DATE) AS [I'm a DATE!]

This returns:

The SMALLDATETIME value rounds this up to January 2nd, instead of January 1. The Date datatype does not.

In considering whether or not to use SMALLDATETIME, you need to learn and establish whether or not to round up for minutes and date values. With a different example, if something occurred at 12:30:31 AM, would that be represented as having  happened in the 12:30 minute, or at 12:31?

Most of us actually want to round down. We want the largest minute number which is less than or equal to the datetime value. This is similar to what FLOOR does for integers. You could also call this truncating the portion of the datetime value you don’t want.  This is not, however, what SMALLDATETIME gives you, so use it with care.

So this is what I’m saying:

Like, seriously, SMALLDATETIME: you are SO messed up.

Comparing Methods of Rounding Dates

So given that warning, let’s actually round some date values, and let’s compare the efficiency of each method.

To start out with, let’s create a table and toss in a bunch of date values. We’ll run queries against these dates and measure SQL Server’s abilities to work with it.

To make up a bunch of datetime data, I’m using my trusty recursive CTE from my prior post.

--Populate a table with some data
CREATE TABLE dbo.Donuts ( DonutTime DATETIME2(7) )
 
DECLARE
@startDate DATETIME2(7)= '2010-12-01 00:00:00' ,
@endDate DATETIME2(7)= '2010-12-11 01:30:00' ;
 
WITH    MyCTE
AS ( SELECT
@startDate AS [Makin' the Donuts]
UNION ALL
SELECT
DATEADD(ms, 1225, [Makin' the Donuts])
FROM
MyCTE
WHERE
[Makin' the Donuts] < @endDate )
INSERT  dbo.Donuts
SELECT
[Makin' the Donuts]
FROM
MyCTE
OPTION
( MAXRECURSION 0 ) ;
 
SELECT @@ROWCOUNT
--We now have 709716 rows of DonutTime

Now let’s look at different methods to manipulate datevalues. For our examples I’ll be rounding to the minute.

Contestant 1 -
DATEPART: isolate each part of the date, then concatenate

As we learn TSQL, this is the first method that occurs to us. We know DATEPART will return part of a date (great name!), so we can chop apart the bits. However, to get them back together properly we have to turn each part into a string to clue them back together. And then if we want to treat it like a date (which we pretty much always do), we have to cast it back.

Just look at this baby. It’s pretty ugly.

SELECT
CAST(CAST(DATEPART(YY, DonutTime) AS CHAR(4)) + '-' + CAST(DATEPART(MM, DonutTime) AS NVARCHAR(2)) + '-'
+ CAST(DATEPART(DD, DonutTime) AS NVARCHAR(2)) + '  ' + CAST(DATEPART(hh, DonutTime) AS NVARCHAR(2)) + ':'
+ CAST(DATEPART(mi, DonutTime) AS NVARCHAR(2)) + ':00.000' AS DATETIME2(0)) AS [Wow, that was a lot of typing.]
FROM
dbo.Donuts

Running this (after cleaning out buffers), I got these results:

Contestant 2 -
Subtracting what you don’t want

There’s a couple of variations on contestant #2. I’ll take the one I like best, which is casting to a smaller byte size by using DATETIME2(0), which is 6 bytes rather than 8 and effectively truncates to the second. Then I’ll subtract the seconds.

SELECT
DATEADD(ss, -DATEPART(ss, DonutTime), CAST (DonutTime AS DATETIME2(0)))
FROM
dbo.Donuts

Running this one (yes, I cleaned out the buffers), I got these results:

Well now, that’s much lower CPU time there.

NB: I did test, and in all my trials it was lower CPU time to cast into DATETIME2 rather than using a nested DATEADD function to subtract milliseconds.

Contestant 3-
Convert to a shorter character string, then back to date

This contestant is near and dear to my heart. I like it because it’s easy for me to remember. You take a short trip into CHAR() with the 121 date format and set the length to chop off the parts of the date you don’t want. Then you cast or convert back to a DATETIME2(0).

I think I like this one because it feels just a little bit violent. But not in a bad way. It’s like roller derby.

SELECT
CAST(CONVERT(CHAR(16), DonutTime, 121) AS DATETIME2(0))
FROM
dbo.Donuts

Oh, sad. This one didn’t do very well. It’s definitely better than Contestant #1, at least.

Contestant 4-
Use DATEADD to calculate the minutes since a given date, then add them back

Here’s the method Jeremiah suggested to me. The way he described it was “Just figure out the number of minutes since the beginning of time, and use that.”

Being a philosophy major, I of course asked “So, when was the beginning of time?”

Being a developer, he answered, “Just call it zero.”

SELECT
DATEADD(mi, DATEDIFF(mi, 0, CAST(DonutTime AS DATETIME2(0))), 0)
FROM
dbo.Donuts

Here are the results (clean buffers, as usual):

Ooo, check out the CPU time on that one.

Note: I ran a few trials and this is faster on the CPU when you cast as DATETIME2(0) before doing your maths. I did that to make all things equal with the other contestants, who had the same benefit.

Who Won, and Why

Here’s a recap of how everyone performed:

Why did contestants 2 and 4 do so well?

Jeremiah pointed out that datetime values are stored internally as two four byte integers. (BOL reference: see “Remarks”) Performing mathematic functions on an integer value is a nice fast activity on the CPU.

Performing conversions back and forth to character based datatypes, however, is not so natural, nor so fast.

What’s the internal storage format of DateTime2?  Well, I’m not sure about that one. BOL isn’t so up-front about these things anymore. If you happen to know, please tell me in the comments. I can tell, however, that it’s something that enjoys mathematics.

Comments

Posted by Skull Killer on 6 January 2011

Hi

Another method is:

DATEADD(MINUTE,FLOOR(CAST(@date AS FLOAT)),0)

Can you try it in your hardware and add to the comparision?

I use this (FLOOR(CAST(@date AS FLOAT))) in DW DBs to generate the date dimension keys.

Posted by SMALLDATETIME owns you on 12 January 2011

I started to read your article but found the childish rant about SMALLDATETIME less than stupid.

If you do not need minute level accuracy then this light weight data type is powerfull.  

If you expect accuracy when down converting any data type to one that is less percise by a factor of two, well then;

YOU ARE SO MESSED UP GIRL, LIKE REALLY, GET A CLUE.

Have you ever helped someone save 25GB in a db file by changing dates stored in a DATETIME feild?

SMALLDATETIME has done that.

Buh Buy..

Posted by Jedak on 12 January 2011

For information about the internal storage of datetime2 see

weblogs.sqlteam.com/.../the-internal-storage-of-a-datetime2-value.aspx

Posted by JJ B on 12 January 2011

Nice!

Posted by WWare on 12 January 2011

I found your, ahem, childish rant to be educational and more than entertaining.  Your ability to share knowledge in a clear fashion with a heavy dose of wit is genius!

Posted by jnmarlin on 12 January 2011

Somebody is really not a morning person. Then again, I am reading this in the afternoon and thoroughly enjoyed your quirky wit.

Posted by Paul Hough on 13 January 2011

Our standard that we showed to be the most efficient for a Data Warehousing instance where we need really only Days and Months:

CONVERT(SMALLDATETIME, DATEADD(DD, DATEDIFF(DD, 0, DonutTime), 0)) AS Dly_Date

and

CONVERT(SMALLDATETIME, DATEADD(MM, DATEDIFF(MM, 0, DonutTime), 0)) AS Mth_Start_Date

On 100 million rows, the only trade-off is the outer conversion to SMALLDATETIME which slows the CPU timing marginally, but without it the result is of type DATETIME. So the trade-off is a saving of about 400 MBytes in the heap and in all nonclustered indexes using these dates. This has a material improvement on IO due to more rows per block / page / IO.

Leave a Comment

Please register or log in to leave a comment.