Calculating Duration Using DATETIME Start and End Dates (SQL Spackle)

  • Thomas Abraham (1/16/2014)


    Eirikur Eiriksson (1/16/2014)


    Thank you for the spackle Jeff but it looks like there is a small crack in it as it makes the assumption that the end time is always greater than the start time. This is the reason why I prefer to use double cast, first to varbinary and then to bigint.

    SELECT

    StartDT

    ,EndDT

    ,Duration = STUFF(CONVERT(VARCHAR(20),EndDT-StartDT,114),1,2,DATEDIFF(hh,0,EndDT-StartDT))

    ,DurationI = STUFF(CONVERT(VARCHAR(20),StartDT-EndDT,114),1,2,DATEDIFF(hh,0,StartDT-EndDT))

    ,cast(cast(StartDT as varbinary(128)) as bigint) As StartTicks

    ,cast(cast(EndDT as varbinary(128)) as bigint) AS EndTicks

    ,cast(cast(StartDT as varbinary(128)) as bigint)-cast(cast(EndDT as varbinary(128)) as bigint) as XmY

    ,cast(cast(EndDT as varbinary(128)) as bigint)-cast(cast(StartDT as varbinary(128)) as bigint) as YmX

    FROM #JBMTest;

    I got:

    Msg 8115, Level 16, State 2, Line 15

    Arithmetic overflow error converting expression to data type datetime.

    when I ran it for date differences greater than about 148 years. Maybe not a problem for some implementations, but something to be aware of.

    The error would be from either the Duration or DurationI as 148 years are well within the approx. 974904000 year limit of the bigint, given there are 300 ticks in a second. The binary-bigint method works on DATETIME and therefore inherits its limits:

    declare @t1 datetime = '9999-12-31 23:59:59.000'

    declare @t2 datetime = '1753-01-01 00:00:00.000'

    SELECT cast(cast(@t1 as varbinary(128)) as bigint)-cast(cast(@t2 as varbinary(128)) as bigint)

  • Dennis Wagner-347763 (1/16/2014)


    The issue with doing simple DATEDIFF in SQL is that it does a straight subtraction. From the example above:

    DATEDIFF(year, '10/16/95', '1/15/14') = 19 (but only 18 full years elapsed)

    DATEDIFF(month, '10/16/95', '1/15/14')%12 = 3 (but only 2 full months elapsed)

    Days is a real mess.

    How about this: DATEDIFF(YEAR, '12/31/13', '1/1/14') = 1! (not the factorial) when only 1 day has elapsed

    Yes, I do see what you mean. See my comment below for a better suggestion.

    When comparing who is older when they achieved their Super Duper Fantastical Grandmaster Status, you have to use number of days because it is not affected by leap years, days in a month, etc.

    So if person A above was compared to person B who was born December 14, 1995 and achieved this noteworthy status on March 15, 2014, then both of them would be 6,666 days old:

    DATEDIFF(day, '10/16/95', '1/15/14') = DATEDIFF(day, '12/14/95', '3/15/14') = 6,666

    However, when formatted as YMD, person A is 18 years, 2 months, 30 days while person B is 18 years, 3 months, 1 day, which would make person A seem younger. My original response states why you can't just fix 30 days in a month or you could end up with x years, 12 months, and 4 days when 364 days elapsed from the previous year anniversary.

    I always use days when figuring out a "top 10" list, but I always format the answer as YMD so the age is meaningful to the audience.

    I was a bit rushed when I responded earlier, and didn't fully communicate what I had in mind when I said, "Which years in the calendar should we use to calculate this?" The solution seems to be to calculate the days for comparison purposes. Then, add the days to a standard reference date, such as 1900-01-01. THEN do your DATEDIFF calculations. That way your 6,666 days will always work out to the same number of years, months & days, no matter what period of time they actually occurred over. The leaps years and months of variable length will always fall in the same spots, so days added to the reference date will always give a determinate result, which I believe is what you require.

    [font="Verdana"]Please don't go. The drones need you. They look up to you.[/font]
    Connect to me on LinkedIn

  • The solution seems to be to calculate the days for comparison purposes. Then, add the days to a standard reference date, such as 1900-01-01. THEN do your DATEDIFF calculations. That way your 6,666 days will always work out to the same number of years, months & days, no matter what period of time they actually occurred over. The leaps years and months of variable length will always fall in the same spots, so days added to the reference date will always give a determinate result, which I believe is what you require.

    Thomas, I like the way you are thinking. My solution was to build a function that would use the elapsed days to do the formatting.

    While I agree that using a reference table would give a consistent determinate result, the issue would be how do we build that reference table. We can't just use 30 days in a month, or we'll run out of months before we run out of days in a year. If we assume like a calendar year that months 1,3,5,7,8,10, and 12 have 31 days, months 4,6,9, and 11 have 30 days and month 2 has either 28 or 29 days, then we'll get the correct number of days and every month of variable length will fall in the same spot. An example would be the standard fiscal week calendar for time shares where February always has 5 weeks every year. That just seems counterintuitive since only 1/4 of the time does it ever have more than 4.

    The issue then becomes presentation and acceptance. If I compare someone born on February 15th with a date on March 16th, using a standard reference table that assumes the first month has 31 days, I should get x years, 0 months, and 29 days. While that answer is absolutely correct, the presentation can get sticky when someone looks at the birth date and the achievement date and argues that should be x years 1 month and 1 day. If you have to write more than 3 words to explain it on your presentation, you've lost your argument!

    How it's presented is part of the issue when the audience requires that it be formatted in YMD. Going back to my original question though: I have a function that is somewhat messy to do the formatting. I was wondering if anyone had a simple elegant solution they've used to do the formatting.

  • I meant to post this earlier, but got caught up in some of the tangent. Thanks to Jeff for providing this spackle. Great job, as always.

    [font="Verdana"]Please don't go. The drones need you. They look up to you.[/font]
    Connect to me on LinkedIn

  • Dennis Wagner-347763 (1/16/2014)

    Thomas, I like the way you are thinking. My solution was to build a function that would use the elapsed days to do the formatting.

    While I agree that using a reference table would give a consistent determinate result, the issue would be how do we build that reference table.

    That's why I'd use a reference DATE, instead of a table. No construction required.

    We can't just use 30 days in a month, or we'll run out of months before we run out of days in a year.

    There are applications where this IS done. Ever seen financial calculators for loans? They use standard 30 day months. But, that's a different animal I suppose.

    The issue then becomes presentation and acceptance. If I compare someone born on February 15th with a date on March 16th, using a standard reference table that assumes the first month has 31 days, I should get x years, 0 months, and 29 days. While that answer is absolutely correct, the presentation can get sticky when someone looks at the birth date and the achievement date and argues that should be x years 1 month and 1 day. If you have to write more than 3 words to explain it on your presentation, you've lost your argument!

    How it's presented is part of the issue when the audience requires that it be formatted in YMD. Going back to my original question though: I have a function that is somewhat messy to do the formatting. I was wondering if anyone had a simple elegant solution they've used to do the formatting.

    Seems like you only have two choices here:

    1. Recognize that there is no simple elegant solution when dealing with our calendar, which is neither simple nor elegant. This means you have to do a modified version of the three DATEDIFFs, which I imagine is what you are already doing, e.g. "years = DATEDIFF(yy, datea, dateb), less 1 if month and day of dateb falls before month and day of datea".

    2. Make a single simplifying assumption (a reference date) that makes all time periods comparable, regardless of when they occur, realizing that there may be some disagreement with the results.

    Both solutions have drawbacks. Which you choose depends on which drawbacks you find acceptable.

    [font="Verdana"]Please don't go. The drones need you. They look up to you.[/font]
    Connect to me on LinkedIn

  • I ran into this in the last job...

    Calculate the duration between start datetime and end datetime for KPI reporting, factor in:

    - Business Days only (no holidays)

    - Business Hours (sites have different effective hours)

    - Time Zones (15 sites around the world...with different holidays)

    "How hard can it be?" they asked.

    Sheesh.

    -d

  • dcowdery (1/16/2014)


    I ran into this in the last job...

    Calculate the duration between start datetime and end datetime for KPI reporting, factor in:

    - Business Days only (no holidays)

    - Business Hours (sites have different effective hours)

    - Time Zones (15 sites around the world...with different holidays)

    "How hard can it be?" they asked.

    Sheesh.

    -d

    Two words: justifiable homicide

    [font="Verdana"]Please don't go. The drones need you. They look up to you.[/font]
    Connect to me on LinkedIn

  • [h4]Try this...[/h4]

    Create Function dbo.ElapsedTime

    (

    @StartTime DateTime,

    @EndTime DateTime

    )

    Returns VarChar(20)

    As

    Begin

    Declare

    @st DateTime,

    @et DateTime,

    @NeedOffset bit;

    -- Reality check!

    If (@StartTime>@EndTime) Return Null;

    Select

    -- Get isolated time components for start and end DateTime inputs.

    @st='1900-01-01 '+Right(Convert(VarChar,@StartTime,121),12),

    @et='1900-01-01 '+Right(Convert(VarChar,@EndTime,121),12),

    -- If there is a time inversion set a flag indicating that offsetting is required.

    @NeedOffset=Case when @st>@et then 1 else 0 End;

    -- The DateTime data-type can represent dates between 1753-01-01 00:00:00.000 and

    -- 9999-12-31 23:59:59.998, inclusive. However, DateDiff(ms,...) will throw an Int

    -- overflow exception if the differential is close to 25 days, therefore, this method

    -- processes the days component separately from the time component.

    Return

    -- Output the DateTime days differential (0 - 3012153).

    Cast

    (

    DateDiff(dd,@StartTime,@EndTime)-

    -- Apply day offset, if necessary.

    Case @NeedOffset when 1 then 1 else 0 End

    as VarChar

    )+

    ':'+

    -- Output the DateTime time differential (00:00:00.000 - 23:59:59.999).

    -- This is done by replacing both input value's date component with

    -- '1900-01-01' before running the appropriate conversion functions.

    Right

    (

    Convert

    (

    VarChar,

    DateAdd

    (

    ms,

    DateDiff

    (

    ms,

    @st,

    @et

    )+

    -- Handle milliseconds offset, if necessary.

    Case @NeedOffset when 1 then 86400000 else 0 End,

    0

    ),

    121

    ),

    12

    );

    End



    PeteK
    I have CDO. It's like OCD but all the letters are in alphabetical order... as they should be.

  • dcowdery (1/16/2014)


    I ran into this in the last job...

    Calculate the duration between start datetime and end datetime for KPI reporting, factor in:

    - Business Days only (no holidays)

    - Business Hours (sites have different effective hours)

    - Time Zones (15 sites around the world...with different holidays)

    "How hard can it be?" they asked.

    Sheesh.

    -d

    I share your pain on this, although somewhat off topic, normally one handles this with a calendar table.:cool:

  • Dennis Wagner-347763 (1/16/2014)


    The solution seems to be to calculate the days for comparison purposes. Then, add the days to a standard reference date, such as 1900-01-01. THEN do your DATEDIFF calculations. That way your 6,666 days will always work out to the same number of years, months & days, no matter what period of time they actually occurred over. The leaps years and months of variable length will always fall in the same spots, so days added to the reference date will always give a determinate result, which I believe is what you require.

    Thomas, I like the way you are thinking. My solution was to build a function that would use the elapsed days to do the formatting.

    While I agree that using a reference table would give a consistent determinate result, the issue would be how do we build that reference table. We can't just use 30 days in a month, or we'll run out of months before we run out of days in a year. If we assume like a calendar year that months 1,3,5,7,8,10, and 12 have 31 days, months 4,6,9, and 11 have 30 days and month 2 has either 28 or 29 days, then we'll get the correct number of days and every month of variable length will fall in the same spot. An example would be the standard fiscal week calendar for time shares where February always has 5 weeks every year. That just seems counterintuitive since only 1/4 of the time does it ever have more than 4.

    The issue then becomes presentation and acceptance. If I compare someone born on February 15th with a date on March 16th, using a standard reference table that assumes the first month has 31 days, I should get x years, 0 months, and 29 days. While that answer is absolutely correct, the presentation can get sticky when someone looks at the birth date and the achievement date and argues that should be x years 1 month and 1 day. If you have to write more than 3 words to explain it on your presentation, you've lost your argument!

    How it's presented is part of the issue when the audience requires that it be formatted in YMD. Going back to my original question though: I have a function that is somewhat messy to do the formatting. I was wondering if anyone had a simple elegant solution they've used to do the formatting.

    When you use the method I suggested, you don't have to worry about any of this. Once you have "normalized" your period by adding the day count to the reference date, then your presentation calculation becomes something like (assuming your NORMALIZED DATE = 1900-01-01 + day count):

    YYYY = DATEPART(YYYY, NORMALIZED DATE) - 1900

    MM = DATEPART(MM, NORMALIZED DATE)

    DD = DATEPART(DD, NORMALIZED DATE)

    Hope this helps in some way.

    [font="Verdana"]Please don't go. The drones need you. They look up to you.[/font]
    Connect to me on LinkedIn

  • Thank you for article, enjoyed the read!

  • I must say, I love the technique Jeff! Very simple, straightforward and clean with good performance. Nothing more straightforward comes immediately to mind (issues noted above aside).

    Normally I take your word on performance but I in this case I played a bit and found something interesting. Running your 1M row test harness, followed by the below query:

    SET STATISTICS TIME ON;

    SELECT @Bitbucket = STUFF(CONVERT(VARCHAR(20),s,114),1,2,DATEDIFF(hh,0,s))

    FROM #JBMTest

    CROSS APPLY (SELECT EndDT-StartDT) a (s);

    SET STATISTICS TIME OFF;

    Generated different speed results on SQL 2008 R2 vs. SQL 2012.

    SQL 2008 R2 (expected - second result uses CA for intermediate calculation):

    (1000000 row(s) affected)

    SQL Server Execution Times:

    CPU time = 702 ms, elapsed time = 699 ms.

    SQL Server Execution Times:

    CPU time = 717 ms, elapsed time = 707 ms.

    SQL 2012 (unexpected):

    (1000000 row(s) affected)

    SQL Server Execution Times:

    CPU time = 749 ms, elapsed time = 738 ms.

    SQL Server Execution Times:

    CPU time = 671 ms, elapsed time = 674 ms.

    Improvements in the optimizer I reckon.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Eirikur Eiriksson (1/16/2014)


    Thank you for the spackle Jeff but it looks like there is a small crack in it as it makes the assumption that the end time is always greater than the start time. This is the reason why I prefer to use double cast, first to varbinary and then to bigint.

    SELECT

    StartDT

    ,EndDT

    ,Duration = STUFF(CONVERT(VARCHAR(20),EndDT-StartDT,114),1,2,DATEDIFF(hh,0,EndDT-StartDT))

    ,DurationI = STUFF(CONVERT(VARCHAR(20),StartDT-EndDT,114),1,2,DATEDIFF(hh,0,StartDT-EndDT))

    ,cast(cast(StartDT as varbinary(128)) as bigint) As StartTicks

    ,cast(cast(EndDT as varbinary(128)) as bigint) AS EndTicks

    ,cast(cast(StartDT as varbinary(128)) as bigint)-cast(cast(EndDT as varbinary(128)) as bigint) as XmY

    ,cast(cast(EndDT as varbinary(128)) as bigint)-cast(cast(StartDT as varbinary(128)) as bigint) as YmX

    FROM #JBMTest;

    You're correct and I could have added a simple statement, such as "Assuming two non-null dates where the StartDT is less than or equal to the EndDT,...", to make sure that no one would have to assume.

    Shifting gears to the code you posted...

    While I very much appreciate alternate methods for just about everything, I have to admit that the code you wrote for the StartTicks and EndTicks columns has me confused. Why are they labeled as "Ticks"? Because DATETIME is based on two integers where one integer represents the number of whole days and the other represents the number of 1/300ths of a second that have passed since midnight, I don't understand where the word "Ticks" comes into play. Without further explanation or esoteric knowledge, one might think that you were implying "clock ticks" and, without some additional calculation, they are not. Further, since the maximum hex value of the "Time" portion for a DATETIME datatype is 018B81FF (23:59:59.997) and falls drastically short of the FFFFFFFF value before the rollover to the next int (day), the VARBINARY/BIGINT converted value can't represent "ticks" of equal duration of any kind. A simple conversion of DATETIME to FLOAT would get you a lot closer to some form of "ticks" value.

    This is the reason why I prefer to use double cast, first to varbinary and then to bigint.

    I'm even more confused after that statement. Please explain how your code, as it is posted, ensures that the StartDT and EndDT values are in the proper order even if they were to produce the required human-readable output (and it IS required by most). From what I can see, the code doesn't resolve that problem. Unless I'm missing something, it simply presents two columns where an additional decision would still need to be made as to whether or not the two DATETIMEs were in the correct order or not.

    And, of course, although most would never approach the limit and as you have yourself recognized, your conversions do have a limit at 148 years whereas the code in the article does not.

    I must be missing something. Please explain.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thomas Abraham (1/16/2014)


    Oliver Gugolz (1/16/2014)


    Here is my simple method to format the output to a more readable format:

    CAST(GETDATE() - @StartDT AS TIME(2)) AS Duration

    But that only displays the fractional hours portion of the difference. You still lose the days converted to hours that Jeff's method preserves.

    Absolutely correct. Thanks for the cover on that one, Thomas.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • mayor 76640 (1/16/2014)


    My specific use of the datetime field is to see if it occurs

    within a user specified date range.

    The examples I received with the stored procedures (from software application vendor)

    had the user enter full date and time (2014-01-14 23:59.59)when I am not interested in the time aspect.

    I changed to have the user just enter the date ('2014-01-14') which defaults to zero time.

    If inclusive start date and end date are entered, with the default zero time, I would miss

    the last date, so I compare

    between @startdate and @enddate + 1

    which does include the entire enddate.

    Thank you for the feedback but how does that pertain to the article?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 16 through 30 (of 57 total)

You must be logged in to reply to this topic. Login to reply