Grouping by date hour with "buckets"

  • Running SQL 2008
    Table structure below

    CREATE TABLE [dbo].[ICA](

    [MsgDateTime] [datetime] NULL,

     [UserName] [varchar](255) NULL,

    [Application] [varchar](255) NULL,

    [StartTime] [datetime] NULL,

    [EndTime] [varchar](255) NULL,

    [Duration] [varchar](255) NULL,

     [ConnectionID] [varchar](255) NULL

    ) ON [PRIMARY]

    I am trying to get the Count of users in 60 minute intervals by Date based off the amount of minutes / hours in the duration field like below
    MSGDateTime  Interval  Users
    2017-02-01          60         130
    2017-02-01          120        200
    2017-02-01         180        4000

    here is an extract of the data that is in the db now

    MsgDateTimeUserNameApplicationStartTimeEndTimeDurationConnectionID
    12/22/16 14:34jshmo NA12/22/16 18:5112/22/2016 19:350:44:04 1e87569
    12/22/16 14:34juser NA12/22/16 19:3512/22/2016 19:360:00:23 1a1e463
    12/22/16 16:14 echeek NA12/22/16 12:5512/22/2016 21:158:20:20 19f101e
    12/22/16 16:14 dbaker NA12/22/16 18:0812/22/2016 21:163:08:15 1e8178a
    12/22/16 16:18hdev NA12/22/16 20:0812/22/2016 21:211:12:40 14b8e6a
    12/22/16 16:20 alone NA12/22/16 21:1412/22/2016 21:210:06:59 1e9ae81
    12/22/16 16:26 yelni NA12/22/16 21:2212/22/2016 21:260:04:11 146fb6e
    12/22/16 16:27 oco NA12/22/16 20:1712/22/2016 21:291:11:43 1a22f8b
    12/22/16 16:28 lkey NA12/22/16 20:4912/22/2016 21:280:38:03 146c6c0
    12/22/16 16:28jfk NA12/22/16 20:0512/22/2016 21:281:22:50 1467dce
    12/22/16 16:19 sdailey NA12/22/16 21:0412/22/2016 21:190:14:40 146de84
    12/22/16 16:19 vene NA12/22/16 21:1512/22/2016 21:190:03:45 146f08c
  • helpful consumable format, still working on the solution:

    IF OBJECT_ID('tempdb.[dbo].[#ICA]') IS NOT NULL
    DROP TABLE [dbo].[#ICA]
    GO
    CREATE TABLE [dbo].[#ICA] (
    [MsgDateTime] DATETIME            NULL,
    [UserName]  VARCHAR(255)               NULL,
    [Application] VARCHAR(255)               NULL,
    [StartTime]  DATETIME            NULL,
    [EndTime]   VARCHAR(255)               NULL,
    [Duration]  VARCHAR(255)               NULL,
    [ConnectionID] VARCHAR(255)               NULL)

    INSERT INTO [#ICA]
    SELECT '12/22/16 14:34','jshmo','NA','12/22/16 18:51','12/22/2016 19:35','0:44:04','1e87569' UNION ALL
    SELECT '12/22/16 14:34','juser','NA','12/22/16 19:35','12/22/2016 19:36','0:00:23','1a1e463' UNION ALL
    SELECT '12/22/16 16:14','echeek','NA','12/22/16 12:55','12/22/2016 21:15','8:20:20','19f101e' UNION ALL
    SELECT '12/22/16 16:14','dbaker','NA','12/22/16 18:08','12/22/2016 21:16','3:08:15','1e8178a' UNION ALL
    SELECT '12/22/16 16:18','hdev','NA','12/22/16 20:08','12/22/2016 21:21','1:12:40','14b8e6a' UNION ALL
    SELECT '12/22/16 16:20','alone','NA','12/22/16 21:14','12/22/2016 21:21','0:06:59','1e9ae81' UNION ALL
    SELECT '12/22/16 16:26','yelni','NA','12/22/16 21:22','12/22/2016 21:26','0:04:11','146fb6e' UNION ALL
    SELECT '12/22/16 16:27','oco','NA','12/22/16 20:17','12/22/2016 21:29','1:11:43','1a22f8b' UNION ALL
    SELECT '12/22/16 16:28','lkey','NA','12/22/16 20:49','12/22/2016 21:28','0:38:03','146c6c0' UNION ALL
    SELECT '12/22/16 16:28','jfk','NA','12/22/16 20:05','12/22/2016 21:28','1:22:50','1467dce' UNION ALL
    SELECT '12/22/16 16:19','sdailey','NA','12/22/16 21:04','12/22/2016 21:19','0:14:40','146de84' UNION ALL
    SELECT '12/22/16 16:19','vene','NA','12/22/16 21:15','12/22/2016 21:19','0:03:45','146f08c'

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Try something like:

    SELECT
          CASE
             WHEN i.Duration BETWEEN '0:00:00' AND '1:00:00' THEN '0-60'
             ELSE
             CASE
                WHEN i.Duration BETWEEN '1:00:01' AND '2:00:00' THEN '60-120'
                ELSE
                   CASE
                      WHEN i.Duration BETWEEN '2:00:01' AND '3:00:00' THEN '120-180'
                      ELSE '>180'
                   END
             END
          END   Buckets
        , Count(*) [Count]
    FROM #ICA i
    GROUP BY CASE
             WHEN i.Duration BETWEEN '0:00:00' AND '1:00:00' THEN '0-60'
             ELSE
             CASE
                WHEN i.Duration BETWEEN '1:00:01' AND '2:00:00' THEN '60-120'
                ELSE
                   CASE
                      WHEN i.Duration BETWEEN '2:00:01' AND '3:00:00' THEN '120-180'
                      ELSE '>180'
                   END
             END
          END;
    DROP TABLE #ICA;
    GO
  • Joe Torre - Thursday, February 2, 2017 1:19 PM

    Try something like:

    SELECT
          CASE
             WHEN i.Duration BETWEEN '0:00:00' AND '1:00:00' THEN '0-60'
             ELSE
             CASE
                WHEN i.Duration BETWEEN '1:00:01' AND '2:00:00' THEN '60-120'
                ELSE
                   CASE
                      WHEN i.Duration BETWEEN '2:00:01' AND '3:00:00' THEN '120-180'
                      ELSE '>180'
                   END
             END
          END   Buckets
        , Count(*) [Count]
    FROM #ICA i
    GROUP BY CASE
             WHEN i.Duration BETWEEN '0:00:00' AND '1:00:00' THEN '0-60'
             ELSE
             CASE
                WHEN i.Duration BETWEEN '1:00:01' AND '2:00:00' THEN '60-120'
                ELSE
                   CASE
                      WHEN i.Duration BETWEEN '2:00:01' AND '3:00:00' THEN '120-180'
                      ELSE '>180'
                   END
             END
          END;
    DROP TABLE #ICA;
    GO

    The problem with this approach is that you may have missed a CASE as in the 8:20:20 duration.  Using Lowell's setup, I came up with the following.

    SELECT 60 * (DATEDIFF(HOUR, '00:00', i.Duration) + 1), COUNT(DISTINCT i.UserName)
    FROM #ICA i
    GROUP BY DATEDIFF(HOUR, '00:00', i.Duration)

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • Your approach much better.

  • So I tried the below

    SELECT 60 * (DATEDIFF(HOUR, '00:00', i.Duration) + 1), COUNT(DISTINCT i.UserName)

    FROM #ICA i

    GROUP BY DATEDIFF(HOUR, '00:00', i.Duration)

    and I got this error

    Msg 241, Level 16, State 1, Line 1

    Conversion failed when converting date and/or time from character string.

  • Joe
    I tried yours and that seems to work,  I have about 18 months of data in this database, how do I get the buckets by day instead of an aggregate of all of them?

  • magsdtev - Thursday, February 2, 2017 3:33 PM

    So I tried the below

    SELECT 60 * (DATEDIFF(HOUR, '00:00', i.Duration) + 1), COUNT(DISTINCT i.UserName)

    FROM #ICA i

    GROUP BY DATEDIFF(HOUR, '00:00', i.Duration)

    and I got this error

    Msg 241, Level 16, State 1, Line 1

    Conversion failed when converting date and/or time from character string.

    This indicates that you probably have some bad data.  You can find this by the following.

    SELECT *
    FROM #ICA
    WHERE TRY_CAST(Duration AS TIME) IS NULL

    This is one of the hazards of not using the correct data type.

    See if you can figure out for yourself how to add a bucket for each day.  You learn much better when you try things for yourself.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • I just realized that the duration field is a varchar not a time field.  I can work around that with a datediff on the start and end times.
    The buckets per day is really stumping me as I am really not sure how to even start that

  • I would create a calendar table with a granularity of day with a StartOfDay set to 12:00:00 AM of the day and a EndOfDay set to 23:90:59.997 (for datetime). With a left join from your table you can generate empty days as well.

  • Sorry for all of the Questions Joe as I am a bit new to doing this sort of thing.

    What would that table look like
    Month_Day Column
    StartOf Day Colum
    EndOf Day column

  • The answer is it depends on your requirements. Many organizations run a fiscal year from June to July so it might include fiscal Month column with start and end of accounting periods. Typically other dates of interest such as company holidays, pay periods, months, years....

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply