Grouping timespans by the avg time per weekday

  • Hi There

    Long time - no see. I would appreciate a bit of help on a query I have to write.

    I work for port authorities and maintain the (legacy: Access frontend) program and DB (SS2K) that calulates the port fees etc.

    Out of our data I need to draw a statistik on the avg time certain shiptypes lie in certain berths and if possible - I also need the avg duration per weekday.

    I've already worked out how to get the avg duration itsself, but per weekday blows me.

    Here for some testdata (well actually its an excerpt of real data):

    CREATE TABLE #TestData(

    [ID] [int] NOT NULL,

    [shiptype] [varchar](40) NULL,

    [arrival] [datetime] NULL,

    [departure] [datetime] NULL,

    [berthID] [char](32) NULL

    )

    GO

    INSERT INTO #TestData (ID ,shiptype ,arrival ,departure ,berthID) VALUES (72264,'Stückgutschiff', CONVERT(datetime, '2012-04-22 11:30:21' ,120),CONVERT(datetime, '2012-04-22 17:00:25', 120),'0185bc14c26359930053c8b73bc908bd')

    INSERT INTO #TestData (ID ,shiptype ,arrival ,departure ,berthID) VALUES (72265,'Frachtbarge', CONVERT(datetime, '2012-04-22 11:30:14' ,120),CONVERT(datetime, '2012-04-22 17:00:35', 120),'0185bc14c26359930053c8b73bc908bd')

    INSERT INTO #TestData (ID ,shiptype ,arrival ,departure ,berthID) VALUES (72542,'Segelschiff', CONVERT(datetime, '2012-04-27 20:20:53' ,120),CONVERT(datetime, '2012-04-29 22:10:29', 120),'0185bc14c26359930053c8b73bc908bd')

    INSERT INTO #TestData (ID ,shiptype ,arrival ,departure ,berthID) VALUES (73910,'Segelschiff', CONVERT(datetime, '2012-05-22 08:00:00' ,120),CONVERT(datetime, '2012-05-22 16:00:35', 120),'0185bc14c26359930053c8b73bc908bd')

    INSERT INTO #TestData (ID ,shiptype ,arrival ,departure ,berthID) VALUES (74608,'sonstige Spezialschiffe', CONVERT(datetime, '2012-06-05 18:20:23' ,120),CONVERT(datetime, '2012-06-06 09:40:24', 120),'0185bc14c26359930053c8b73bc908bd')

    INSERT INTO #TestData (ID ,shiptype ,arrival ,departure ,berthID) VALUES (75655,'Stückgutschiff', CONVERT(datetime, '2012-06-22 08:30:33' ,120),CONVERT(datetime, '2012-06-22 09:45:20', 120),'0185bc14c26359930053c8b73bc908bd')

    INSERT INTO #TestData (ID ,shiptype ,arrival ,departure ,berthID) VALUES (76423,'Segelschiff', CONVERT(datetime, '2012-07-06 11:15:05' ,120),CONVERT(datetime, '2012-07-08 16:55:07', 120),'0185bc14c26359930053c8b73bc908bd')

    INSERT INTO #TestData (ID ,shiptype ,arrival ,departure ,berthID) VALUES (76874,'sonstige Spezialschiffe', CONVERT(datetime, '2012-07-16 14:30:23' ,120),CONVERT(datetime, '2012-07-17 11:00:01', 120),'0185bc14c26359930053c8b73bc908bd')

    INSERT INTO #TestData (ID ,shiptype ,arrival ,departure ,berthID) VALUES (79277,'Segelschiff', CONVERT(datetime, '2012-09-01 10:40:35' ,120),CONVERT(datetime, '2012-09-03 18:00:29', 120),'0185bc14c26359930053c8b73bc908bd')

    INSERT INTO #TestData (ID ,shiptype ,arrival ,departure ,berthID) VALUES (84292,'Containerschiff', CONVERT(datetime, '2012-12-06 13:45:30' ,120),CONVERT(datetime, '2012-12-07 12:00:41', 120),'0185bc14c26359930053c8b73bc908bd')

    INSERT INTO #TestData (ID ,shiptype ,arrival ,departure ,berthID) VALUES (67410,'Stückgutschiff', CONVERT(datetime, '2012-01-19 07:50:26' ,120),CONVERT(datetime, '2012-01-19 09:10:26', 120),'083af37ac2635993011c8e1080a1fba5')

    INSERT INTO #TestData (ID ,shiptype ,arrival ,departure ,berthID) VALUES (68184,'Forschungsschiff', CONVERT(datetime, '2012-02-03 13:30:09' ,120),CONVERT(datetime, '2012-02-28 16:15:32', 120),'083af37ac2635993011c8e1080a1fba5')

    INSERT INTO #TestData (ID ,shiptype ,arrival ,departure ,berthID) VALUES (69040,'Fischfrachter', CONVERT(datetime, '2012-02-21 10:30:31' ,120),CONVERT(datetime, '2012-05-02 23:00:09', 120),'083af37ac2635993011c8e1080a1fba5')

    INSERT INTO #TestData (ID ,shiptype ,arrival ,departure ,berthID) VALUES (71385,'Deckladungsponton', CONVERT(datetime, '2012-04-09 18:00:15' ,120),CONVERT(datetime, '2012-04-10 15:00:00', 120),'083af37ac2635993011c8e1080a1fba5')

    INSERT INTO #TestData (ID ,shiptype ,arrival ,departure ,berthID) VALUES (74602,'Forschungsschiff', CONVERT(datetime, '2012-06-05 06:30:15' ,120),CONVERT(datetime, '2012-06-11 10:00:33', 120),'083af37ac2635993011c8e1080a1fba5')

    I know that I can get the duration by using timediff and when grouping by berthID and shipstype get the avg overall duration.

    what I need:

    the data grouped by berthID, shiptype and then the avg duration per weekday

    the data grouped by berthID, shiptype and the avg count of ships per weekday (over all the days they lay in berth , not just the arrivaldate or departuredate)

    I have my SS2K8 on my dev environment with a linked server pointing to the SS2K.

    I've read about CTE calandar tables but I just cant get my head round it.

    I hope I've explained the problem well enough -

    Thanks in Advance for any responses

    nano

  • Does this look like it's close?

    ;WITH Tally (n) AS (

    SELECT 0 UNION ALL

    SELECT TOP (

    SELECT MAX(DATEDIFF(day, DATEADD(day, DATEDIFF(day, 0, [arrival]), 0)

    ,DATEADD(day, DATEDIFF(day, 0, [departure]), 0)))

    FROM #TestData

    )

    ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

    FROM sys.all_columns)

    SELECT [ID],

    [shiptype],

    [dayofweek]=DATEPART(weekday, [date]),

    [AvgDurationInMinutes]=AVG(CASE

    WHEN DATEADD(day, DATEDIFF(day, 0, [arrival]), 0) = DATEADD(day, DATEDIFF(day, 0, [departure]), 0)

    THEN DATEDIFF(minute, arrival, departure)

    WHEN DATEADD(day, DATEDIFF(day, 0, [arrival]), 0) = a.[date]

    THEN DATEDIFF(minute, arrival, a.[date]+1)

    WHEN DATEADD(day, DATEDIFF(day, 0, [departure]), 0) = a.[date]

    THEN DATEDIFF(minute, a.[date], departure)

    ELSE 24*60 END),

    [CountOfShips]=COUNT(ID)

    FROM #TestData

    CROSS APPLY (

    SELECT DATEADD(day, n, arrival)

    FROM Tally

    WHERE n BETWEEN 0 AND DATEDIFF(day, DATEADD(day, DATEDIFF(day, 0, [arrival]), 0)

    ,DATEADD(day, DATEDIFF(day, 0, [departure]), 0))) a([date])

    GROUP BY [ID], [shiptype], DATEPART(weekday, [date])

    ORDER BY [ID], [shiptype], [dayofweek]

    Tally is like the CTE calendar table you described except it just returns an integer from 0 to n (number of days) based on arrival/departure of the ship. The actual dates are generated in the CROSS APPLY, then these are used to establish weekday and avg/counts.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Hi Dwain

    First of all - sorry for the late response - I was off work for a long weekend and my RDP was playing up, so I couldn't get on to my server to check.

    From What I can see sofar that looks like what I need - Have yet to verify the results. Thank you so much for your help

    nano

  • You're welcome!


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Hi again

    I had the chance to verify the results, and while the count up is correct - the avg isnt. I have taken the ships, that lay in berth more than 7 Days out of the test data - so heres the new Test Data:

    CREATE TABLE #TestData(

    [ID] [int] NOT NULL,

    [shiptype] [varchar](40) NULL,

    [arrival] [datetime] NULL,

    [departure] [datetime] NULL,

    [berthID] [char](32) NULL

    )

    GO

    INSERT INTO #TestData (ID ,shiptype ,arrival ,departure ,berthID) VALUES (72264,'Stückgutschiff', CONVERT(datetime, '2012-04-22 11:30:21' ,120),CONVERT(datetime, '2012-04-22 17:00:25', 120),'0185bc14c26359930053c8b73bc908bd')

    INSERT INTO #TestData (ID ,shiptype ,arrival ,departure ,berthID) VALUES (72265,'Frachtbarge', CONVERT(datetime, '2012-04-22 11:30:14' ,120),CONVERT(datetime, '2012-04-22 17:00:35', 120),'0185bc14c26359930053c8b73bc908bd')

    INSERT INTO #TestData (ID ,shiptype ,arrival ,departure ,berthID) VALUES (72542,'Segelschiff', CONVERT(datetime, '2012-04-27 20:20:53' ,120),CONVERT(datetime, '2012-04-29 22:10:29', 120),'0185bc14c26359930053c8b73bc908bd')

    INSERT INTO #TestData (ID ,shiptype ,arrival ,departure ,berthID) VALUES (73910,'Segelschiff', CONVERT(datetime, '2012-05-22 08:00:00' ,120),CONVERT(datetime, '2012-05-22 16:00:35', 120),'0185bc14c26359930053c8b73bc908bd')

    INSERT INTO #TestData (ID ,shiptype ,arrival ,departure ,berthID) VALUES (74608,'sonstige Spezialschiffe', CONVERT(datetime, '2012-06-05 18:20:23' ,120),CONVERT(datetime, '2012-06-06 09:40:24', 120),'0185bc14c26359930053c8b73bc908bd')

    INSERT INTO #TestData (ID ,shiptype ,arrival ,departure ,berthID) VALUES (75655,'Stückgutschiff', CONVERT(datetime, '2012-06-22 08:30:33' ,120),CONVERT(datetime, '2012-06-22 09:45:20', 120),'0185bc14c26359930053c8b73bc908bd')

    INSERT INTO #TestData (ID ,shiptype ,arrival ,departure ,berthID) VALUES (76423,'Segelschiff', CONVERT(datetime, '2012-07-06 11:15:05' ,120),CONVERT(datetime, '2012-07-08 16:55:07', 120),'0185bc14c26359930053c8b73bc908bd')

    INSERT INTO #TestData (ID ,shiptype ,arrival ,departure ,berthID) VALUES (76874,'sonstige Spezialschiffe', CONVERT(datetime, '2012-07-16 14:30:23' ,120),CONVERT(datetime, '2012-07-17 11:00:01', 120),'0185bc14c26359930053c8b73bc908bd')

    INSERT INTO #TestData (ID ,shiptype ,arrival ,departure ,berthID) VALUES (79277,'Segelschiff', CONVERT(datetime, '2012-09-01 10:40:35' ,120),CONVERT(datetime, '2012-09-03 18:00:29', 120),'0185bc14c26359930053c8b73bc908bd')

    INSERT INTO #TestData (ID ,shiptype ,arrival ,departure ,berthID) VALUES (84292,'Containerschiff', CONVERT(datetime, '2012-12-06 13:45:30' ,120),CONVERT(datetime, '2012-12-07 12:00:41', 120),'0185bc14c26359930053c8b73bc908bd')

    INSERT INTO #TestData (ID ,shiptype ,arrival ,departure ,berthID) VALUES (67410,'Stückgutschiff', CONVERT(datetime, '2012-01-19 07:50:26' ,120),CONVERT(datetime, '2012-01-19 09:10:26', 120),'083af37ac2635993011c8e1080a1fba5')

    INSERT INTO #TestData (ID ,shiptype ,arrival ,departure ,berthID) VALUES (71385,'Deckladungsponton', CONVERT(datetime, '2012-04-09 18:00:15' ,120),CONVERT(datetime, '2012-04-10 15:00:00', 120),'083af37ac2635993011c8e1080a1fba5')

    INSERT INTO #TestData (ID ,shiptype ,arrival ,departure ,berthID) VALUES (74602,'Forschungsschiff', CONVERT(datetime, '2012-06-05 06:30:15' ,120),CONVERT(datetime, '2012-06-11 10:00:33', 120),'083af37ac2635993011c8e1080a1fba5')

    Changed [ID] (as this is the movement ID) in the SELECT and GROUP BY to [berthID] so using the following code:

    ;WITH Tally (n) AS (

    SELECT 0 UNION ALL

    SELECT TOP (

    SELECT MAX(DATEDIFF(day, DATEADD(day, DATEDIFF(day, 0, [arrival]), 0)

    ,DATEADD(day, DATEDIFF(day, 0, [departure]), 0)))

    FROM #TestData

    )

    ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

    FROM sys.all_columns)

    SELECT [berthId],

    [shiptype],

    [dayofweek]=DATEPART(weekday, [date]),

    [AvgDurationInMinutes]=AVG(CASE

    WHEN DATEADD(day, DATEDIFF(day, 0, [arrival]), 0) = DATEADD(day, DATEDIFF(day, 0, [departure]), 0)

    THEN DATEDIFF(minute, arrival, departure)

    WHEN DATEADD(day, DATEDIFF(day, 0, [arrival]), 0) = a.[date]

    THEN DATEDIFF(minute, arrival, a.[date]+1)

    WHEN DATEADD(day, DATEDIFF(day, 0, [departure]), 0) = a.[date]

    THEN DATEDIFF(minute, a.[date], departure)

    ELSE 24*60 END),

    [CountOfShips]=COUNT(ID)

    FROM #TestData

    CROSS APPLY (

    SELECT DATEADD(day, n, arrival)

    FROM Tally

    WHERE n BETWEEN 0 AND DATEDIFF(day, DATEADD(day, DATEDIFF(day, 0, [arrival]), 0)

    ,DATEADD(day, DATEDIFF(day, 0, [departure]), 0))) a([date])

    GROUP BY [berthId], [shiptype], DATEPART(weekday, [date])

    ORDER BY [berthId], [shiptype], [dayofweek]

    The results (veriryed in Excel) should look like this (taken out berthID and shiptype for readability):

    dayavgmincountships

    4615.001

    5720.001

    7330.001

    1360.001

    2480.001

    5492.502

    61226.673

    71261.673

    1570.001

    2500.002

    3580.001

    575.001

    7330.001

    1360.001

    2900.001

    1600.001

    21050.001

    31440.001

    41440.001

    51440.001

    61440.001

    71440.001

    480.001

    Where a ship lays overnight in Berth the results are not correct. Is the avg done perhaps a level too soon? Do I need to do the CROSS APPLY for the dates in a subselect and then group, count and avg? Is that even possible?

    I will play around a bit and see If I can get it right - Many thanks for your help anyway.

    nano

  • It is difficult to interpret your Excel results because you don't have any identifying keys saying what ship type it is.

    Can you align the results from Excel with the results of this query?

    ;WITH Tally (n) AS (

    SELECT 0 UNION ALL

    SELECT TOP (

    SELECT MAX(DATEDIFF(day, DATEADD(day, DATEDIFF(day, 0, [arrival]), 0)

    ,DATEADD(day, DATEDIFF(day, 0, [departure]), 0)))

    FROM #TestData

    )

    ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

    FROM sys.all_columns)

    SELECT [berthId],

    [shiptype],

    [date]=MAX([date]),

    [dayofweek]=DATEPART(weekday, [date]),

    [AvgDurationInMinutes]=AVG(CASE

    WHEN DATEADD(day, DATEDIFF(day, 0, [arrival]), 0) = DATEADD(day, DATEDIFF(day, 0, [departure]), 0)

    THEN DATEDIFF(minute, arrival, departure)

    WHEN DATEADD(day, DATEDIFF(day, 0, [arrival]), 0) = a.[date]

    THEN DATEDIFF(minute, arrival, a.[date]+1)

    WHEN DATEADD(day, DATEDIFF(day, 0, [departure]), 0) = a.[date]

    THEN DATEDIFF(minute, a.[date], departure)

    ELSE 24*60 END),

    [CountOfShips]=COUNT(ID)

    FROM #TestData

    CROSS APPLY (

    SELECT DATEADD(day, n, arrival)

    FROM Tally

    WHERE n BETWEEN 0 AND DATEDIFF(day, DATEADD(day, DATEDIFF(day, 0, [arrival]), 0)

    ,DATEADD(day, DATEDIFF(day, 0, [departure]), 0))) a([date])

    GROUP BY [berthId], [shiptype], DATEPART(weekday, [date])

    ORDER BY [berthId], [shiptype], [date], [dayofweek]

    DROP TABLE #TestData

    I added a Date column to help identify the base records that were counted.

    Will check back shortly to see if a response came in.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Hi Dwain

    The Excel results are inline with the query I posted - same order, same amount of results - ( I did come across one mistake though - 4th line from the top should be 1080 instead of 360)

    From what I have tried out there is a problem with the cases - a.[date] from the cross apply with the tally table contains the time, so the cases on arrival day and on departure day dont match, which then gives the default 24*60 - but thanks to your code I worked it out:

    the code is the following:

    ;WITH Tally (n) AS (

    SELECT 0 UNION ALL

    SELECT TOP (

    SELECT MAX(DATEDIFF(day, DATEADD(day, DATEDIFF(day, 0, [arrival]), 0)

    ,DATEADD(day, DATEDIFF(day, 0, [departure]), 0)))

    FROM #TestData

    )

    ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

    FROM sys.all_columns)

    SELECT berthID,

    [shiptype],

    [dayofweek]=DATEPART(weekday, [date]),

    [DurationInMinutes]=AVG(CASE

    WHEN DATEADD(day, DATEDIFF(day, 0, [arrival]), 0) = DATEADD(day, DATEDIFF(day, 0, [departure]), 0)

    THEN DATEDIFF(minute, arrival, departure)

    WHEN DATEADD(day, DATEDIFF(day, 0, [arrival]), 0) = DATEADD(day, DATEDIFF(day, 0, a.[date]), 0)

    THEN DATEDIFF(minute, arrival, DATEADD(day, DATEDIFF(day, 0, a.[date]), 0)+1)

    WHEN DATEADD(day, DATEDIFF(day, 0, [departure]), 0) = DATEADD(day, DATEDIFF(day, 0, a.[date]), 0)

    THEN DATEDIFF(minute, DATEADD(day, DATEDIFF(day, 0, a.[date]), 0), departure)

    ELSE 24*60 END),

    [CountOfShips]=COUNT(ID)

    FROM #TestData

    CROSS APPLY (

    SELECT DATEADD(day, n, arrival)

    FROM Tally

    WHERE n BETWEEN 0 AND DATEDIFF(day, DATEADD(day, DATEDIFF(day, 0, [arrival]), 0)

    ,DATEADD(day, DATEDIFF(day, 0, [departure]), 0))) a([date])

    GROUP BY berthID, [shiptype], DATEPART(weekday, [date])

    ORDER BY berthID, [shiptype], [dayofweek]

    I cant see if it can be simplified at the moment - for example taking the time out of the cross apply date ...

    But thank you ever so much

    nano

  • Sounds like you got it working! That's great!

    I thought of a simplification right after I posted the original, so here it is based on the latest query you just posted.

    ;WITH Tally (n) AS (

    SELECT 0 UNION ALL

    SELECT TOP (

    SELECT MAX(DATEDIFF(day, [arrival], [departure]))

    FROM #TestData

    )

    ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

    FROM sys.all_columns)

    SELECT berthID,

    [shiptype],

    [dayofweek]=DATEPART(weekday, [date]),

    [DurationInMinutes]=AVG(CASE

    WHEN arrivalday = departureday

    THEN DATEDIFF(minute, arrival, departure)

    WHEN arrivalday = DATEADD(day, DATEDIFF(day, 0, a.[date]), 0)

    THEN DATEDIFF(minute, arrival, DATEADD(day, DATEDIFF(day, 0, a.[date]), 0)+1)

    WHEN departureday = DATEADD(day, DATEDIFF(day, 0, a.[date]), 0)

    THEN DATEDIFF(minute, DATEADD(day, DATEDIFF(day, 0, a.[date]), 0), departure)

    ELSE 24*60 END),

    [CountOfShips]=COUNT(ID)

    FROM #TestData

    CROSS APPLY (

    SELECT arrivalday=DATEADD(day, DATEDIFF(day, 0, [arrival]), 0)

    ,departureday=DATEADD(day, DATEDIFF(day, 0, [departure]), 0)) b

    CROSS APPLY (

    SELECT DATEADD(day, n, arrival)

    FROM Tally

    WHERE n BETWEEN 0 AND DATEDIFF(day, arrivalday, departureday)) a([date])

    GROUP BY berthID, [shiptype], DATEPART(weekday, [date])

    ORDER BY berthID, [shiptype], [dayofweek]

    At least I think it looks simpler but that can be a matter of taste.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply