Counting Records summed into time overlap

  • I have successfully utilised code from the Overlapping Date https://www.sqlservercentral.com/forums/topic/consolidate-overlapping-date-periods queries with the resulting code below. I have included cutdown data in outputs in this post that I hope gives an understanding of what is happening.

    ;WITH C1 AS (
    SELECT dt.TheDate
    ,jbs.asset_id
    ,ts=jbs.created_date
    ,Type=1
    ,e=NULL
    ,s=ROW_NUMBER() OVER (PARTITION BY dt.thedate, jbs.asset_id ORDER BY jbs.created_date)
    ---,JobCount = COUNT(*) OVER (PARTITION BY dt.thedate, jbs.asset_id ORDER BY jbs.created_date)
    FROM [dbo].Jobs jbs
    Join dbo.DateDimension dt on
    Datediff(day, dt.TheDate , jbs.created_date) <=0
    and Datediff(day, dt.TheDate , isnull(jbs.completed_datetime, getdate())) >=0
    Join [dbo].Assets ast on ast.asset_id = jbs.asset_id
    where jbs.asset_id ='TCS_1270'
    And jbs.[availability] ='Y' and isnull(ast.deadflag,'') <> 'Y'
    UNION ALL
    SELECT dt.TheDate
    ,jbs.asset_id
    ,ts=isnull(jbs.completed_datetime, getdate())
    ,Type=-1
    ,e=ROW_NUMBER() OVER (PARTITION BY dt.thedate, jbs.asset_id ORDER BY jbs.completed_datetime)
    ,s=NULL
    FROM [dbo].Jobs jbs
    Join dbo.DateDimension dt on
    Datediff(day, dt.TheDate , jbs.created_date) <=0
    and Datediff(day, dt.TheDate , isnull(jbs.completed_datetime, getdate())) >=0
    Join [dbo].Assets ast on ast.asset_id = jbs.asset_id
    where jbs.asset_id ='TCS_1270'
    And jbs.[availability] ='Y' and isnull(ast.deadflag,'') <> 'Y'),
    C2 AS (
    SELECT C1.*
    ,se=ROW_NUMBER() OVER (PARTITION BY thedate,asset_id ORDER BY ts, Type DESC)
    FROM C1),
    C3 AS (
    SELECT thedate, asset_Id, ts
    ,grpnm=FLOOR((ROW_NUMBER() OVER (PARTITION BY thedate,asset_id ORDER BY ts)-1) / 2 + 1)
    FROM C2
    WHERE COALESCE(s-(se-s)-1, (se-e)-e) = 0),
    C4 as (SELECT thedate
    , asset_Id
    , StartDate=MIN(ts)
    , EndDate=MAX(ts)
    , DATEDIFF(minute, MIN(ts),MAX(ts)) as MinDown
    FROM C3
    GROUP BY thedate, asset_Id, grpnm)
    select * from C4

    But I need to be able to add a count of the records before they are merged into the non-overlapped time.

    An example of the data in a table showing records with an overlap

    SET ANSI_NULLS ON
    GO

    SET QUOTED_IDENTIFIER ON
    GO

    CREATE TABLE [dbo].[a_add_jobcount](
    [TheDate] [date] NULL,
    [asset_id] [varchar](50) NULL,
    [created_date] [datetime] NULL,
    [completed_datetime] [datetime] NOT NULL,
    [JobCount] [int] NOT NULL
    ) ON [PRIMARY]
    GO

    Insert into [dbo].[a_add_jobcount] VALUES ('2022-08-07','TCS_1270','2022-08-07 17:58:00.000','2022-08-07 23:15:00.000',1)
    Insert into [dbo].[a_add_jobcount] VALUES ('2022-08-07','TCS_1270','2022-08-07 08:32:00.000','2022-08-07 10:36:00.000',1)
    Insert into [dbo].[a_add_jobcount] VALUES ('2022-08-06','TCS_1270','2022-08-06 20:06:00.000','2022-08-06 22:05:00.000',1)
    Insert into [dbo].[a_add_jobcount] VALUES ('2022-08-04','TCS_1270','2022-08-04 18:57:00.000','2022-08-04 19:54:00.000',1)
    Insert into [dbo].[a_add_jobcount] VALUES ('2021-08-24','TCS_1270','2021-08-24 12:45:00.000','2021-08-24 16:00:00.000',1)
    Insert into [dbo].[a_add_jobcount] VALUES ('2021-08-24','TCS_1270','2021-08-24 15:30:00.000','2021-08-24 20:25:00.000',1)

    Although this is not the actual data it is the output from the Overlap code.  When I run the Overlap code at the top it works fine with the two records for 2021-08-24 producing the correct start and end times with the others being returned as expected.

    A table shown what is returned with the datediff(minutes) already calcuated in the mindown column.

    SET ANSI_NULLS ON
    GO

    SET QUOTED_IDENTIFIER ON
    GO

    CREATE TABLE [dbo].[a_fix_overlap](
    [thedate] [date] NULL,
    [asset_Id] [varchar](50) NULL,
    [StartDate] [datetime] NULL,
    [EndDate] [datetime] NULL,
    [MinDown] [int] NULL
    ) ON [PRIMARY]
    GO


    Insert into [dbo].[a_fix_overlap] VALUES ('2021-08-24','TCS_1270','2021-08-24 12:45:00.000','2021-08-24 20:25:00.000',460)
    Insert into [dbo].[a_fix_overlap] VALUES ('2022-08-04','TCS_1270','2022-08-04 18:57:00.000','2022-08-04 19:54:00.000',57)
    Insert into [dbo].[a_fix_overlap] VALUES ('2022-08-06','TCS_1270','2022-08-06 20:06:00.000','2022-08-06 22:05:00.000',119)
    Insert into [dbo].[a_fix_overlap] VALUES ('2022-08-07','TCS_1270','2022-08-07 08:32:00.000','2022-08-07 10:36:00.000',124)
    Insert into [dbo].[a_fix_overlap] VALUES ('2022-08-07','TCS_1270','2022-08-07 17:58:00.000','2022-08-07 23:15:00.000',317)

     

    What I need to do is to Sum the jobcount so that it would produce a value of 2 for the records that are merged - 2021-08-24 but not sum for 2022-08-07.  Having stepped through the "Overlapping Date" code I couldn't see an appropriate place to add in a Sum of JobCounts.  When I try to add the sum in an inline view joined to the top code it does not return correct results.

    ;WITH C1 AS (
    SELECT dt.TheDate
    ,jbs.asset_id
    ,ts=jbs.created_date
    ,Type=1
    ,e=NULL
    ,s=ROW_NUMBER() OVER (PARTITION BY dt.thedate, jbs.asset_id ORDER BY jbs.created_date)
    ---,JobCount = COUNT(*) OVER (PARTITION BY dt.thedate, jbs.asset_id ORDER BY jbs.created_date)
    FROM [dbo].Jobs jbs
    Join dbo.DateDimension dt on
    Datediff(day, dt.TheDate , jbs.created_date) <=0
    and Datediff(day, dt.TheDate , isnull(jbs.completed_datetime, getdate())) >=0
    Join [dbo].Assets ast on ast.asset_id = jbs.asset_id
    where asset_code = 'TCS' ---jbs.asset_id ='TCS_1630'
    And jbs.[availability] ='Y' and isnull(ast.deadflag,'') <> 'Y'
    UNION ALL
    SELECT dt.TheDate
    ,jbs.asset_id
    ,ts=isnull(jbs.completed_datetime, getdate())
    ,Type=-1
    ,e=ROW_NUMBER() OVER (PARTITION BY dt.thedate, jbs.asset_id ORDER BY jbs.completed_datetime)
    ,s=NULL
    FROM [dbo].Jobs jbs
    Join dbo.DateDimension dt on
    Datediff(day, dt.TheDate , jbs.created_date) <=0
    and Datediff(day, dt.TheDate , isnull(jbs.completed_datetime, getdate())) >=0
    Join [dbo].Assets ast on ast.asset_id = jbs.asset_id
    where asset_code = 'TCS' ---jbs.asset_id ='TCS_1630'
    And jbs.[availability] ='Y' and isnull(ast.deadflag,'') <> 'Y'),
    C2 AS (
    SELECT C1.*
    ,se=ROW_NUMBER() OVER (PARTITION BY thedate,asset_id ORDER BY ts, Type DESC)
    FROM C1),
    C3 AS (
    SELECT thedate, asset_Id, ts
    ,grpnm=FLOOR((ROW_NUMBER() OVER (PARTITION BY thedate,asset_id ORDER BY ts)-1) / 2 + 1)
    FROM C2
    WHERE COALESCE(s-(se-s)-1, (se-e)-e) = 0),
    C4 as (SELECT thedate
    , asset_Id
    , StartDate=MIN(ts)
    , EndDate=MAX(ts)
    , DATEDIFF(minute, MIN(ts),MAX(ts)) as MinDown
    FROM C3
    GROUP BY thedate, asset_Id, grpnm)

    ---final part
    Select dt.Thedate
    ,ast.Asset_id
    ,ast.Asset_code
    ,ast.Asset_Type
    ,StartDate DownTimeStart
    ,EndDate DownTimeEnd
    ,MinDown
    ,Isnull(a.JobCount,0) as JobCount
    from [dbo].[Assets] ast
    Join dbo.DateDimension dt on Datediff(day, dt.TheDate , ast.feature_start_date) <=0
    and Datediff(day, dt.TheDate , isnull(nullif(ast.feature_end_date,'3000-01-01 00:00:00.000'), getdate())) >=0
    Join C4 jbs on jbs.thedate = dt.thedate and jbs.asset_id = ast.asset_id
    ---get the individual records and put a count of 1 against them
    ---then join to the Overlap code
    Join (Select dt.TheDate
    ,jbs.asset_id
    ,min(jbs.created_date) as created_date
    ,Isnull(Max(jbs.completed_datetime),getdate()) as completed_datetime
    ,count(*) as JobCount
    from [dbo].[Jobs] jbs
    Join dbo.DateDimension dt on
    Datediff(day, dt.TheDate , jbs.created_date) <=0
    and Datediff(day, dt.TheDate , isnull(jbs.completed_datetime, getdate())) >=0
    Join [dbo].[Assets] ast on ast.asset_id = jbs.asset_id
    Where ast.asset_id = 'TCS_1270'
    And jbs.[availability] ='Y' and isnull(ast.deadflag,'') <> 'Y'
    Group by dt.TheDate, jbs.asset_id) a on a.TheDate = dt.TheDate and a.asset_id = ast.asset_id
    order by thedate, asset_id

    Which produces the double count

     

    ThedateAsset_idAsset_codeAsset_TypeDownTimeStartDownTimeEndMinDownJobCount
    2021-08-24TCS_1270TCSTCS: Traffic Control System2021-08-24 12:45:00.0002021-08-24 20:25:00.0004602
    2022-08-04TCS_1270TCSTCS: Traffic Control System2022-08-04 18:57:00.0002022-08-04 19:54:00.000571
    2022-08-06TCS_1270TCSTCS: Traffic Control System2022-08-06 20:06:00.0002022-08-06 22:05:00.0001191
    2022-08-07TCS_1270TCSTCS: Traffic Control System2022-08-07 08:32:00.0002022-08-07 10:36:00.0001242
    2022-08-07TCS_1270TCSTCS: Traffic Control System2022-08-07 17:58:00.0002022-08-07 23:15:00.0003172​

    Thanks in advance

  • Please create a entity relationship diagram explaining what the tables are and what certain columns mean. and how they are related/joined.

    Provide a simple case of input data and what the output should look like. IE.. to illustrate your goal.

    I think this approach would make your presentation more efficient.

    ----------------------------------------------------

  • Thanks for that.

    The ERD is only 3 tables

    sc_erd

    With the tables and simple sample data in the following code:

    ---create assets_sc table

    SET ANSI_NULLS ON

    GO

     

    SET QUOTED_IDENTIFIER ON

    GO

     

    CREATE TABLE [dbo].[assets_sc](

    [asset_id] [varchar](50) NULL,

    [asset_code] [varchar](50) NULL,

    [kpi_availability] [varchar](50) NULL

    ) ON [PRIMARY]

    GO

     

    ---add required value for example

    Insert into [dbo].[assets_sc] VALUES ('TCS_1270','TCS', 'No)

     

     

    ---create_jobs_sc table

    SET ANSI_NULLS ON

    GO

     

    SET QUOTED_IDENTIFIER ON

    GO

     

    CREATE TABLE [dbo].[jobs_sc](

    [asset_id] [varchar](50) NULL,

    [created_date] [datetime] NULL,

    [completed_datetime] [datetime] NULL,

    [availability] [varchar](20) NULL

    ) ON [PRIMARY]

    GO

     

     

    ---add required values for example

    Insert into [dbo].[jobs_sc] VALUES ('TCS_1270','2022-08-07 17:58:00.000','2022-08-07 23:15:00.000',Y)

    Insert into [dbo].[jobs_sc] VALUES ('TCS_1270','2022-08-07 08:32:00.000','2022-08-07 10:36:00.000',Y)

    Insert into [dbo].[jobs_sc] VALUES ('TCS_1270','2022-08-06 20:06:00.000','2022-08-06 22:05:00.000',Y)

    Insert into [dbo].[jobs_sc] VALUES ('TCS_1270','2022-08-04 18:57:00.000','2022-08-04 19:54:00.000',Y)

    Insert into [dbo].[jobs_sc] VALUES ('TCS_1270','2021-08-24 12:45:00.000','2021-08-24 16:00:00.000',Y)

    Insert into [dbo].[jobs_sc] VALUES ('TCS_1270','2021-08-24 15:30:00.000','2021-08-24 20:25:00.000',Y)

     

     

    ---create datedim

    SET ANSI_NULLS ON

    GO

     

    SET QUOTED_IDENTIFIER ON

    GO

     

    CREATE TABLE [dbo].[datedim_sc](

    [thedate] [date] NULL

    ) ON [PRIMARY]

    GO

     

     

    ---Insert required values

    Insert into [dbo].[datedim_sc] VALUES ('2022-08-07')

    Insert into [dbo].[datedim_sc] VALUES ('2022-08-06')

    Insert into [dbo].[datedim_sc] VALUES ('2022-08-04')

    Insert into [dbo].[datedim_sc] VALUES ('2021-08-24')

     

    With this query producing a good example of the raw results.  The overlapped time is highlighted where the count of jobs will need to roll up to 2.

    SELECT dt.TheDate

    ,jbs.asset_id

    ,jbs.created_date

    ,jbs.completed_datetime

    ,JoBCnt = 1

    ,DateDiff(MINUTE,jbs.created_date,jbs.completed_datetime) as outagetime

    FROM  [dbo].jobs_sc jbs

    Join dbo.datedim_sc  dt  on

    Datediff(day, dt.TheDate , jbs.created_date) <=0

    and Datediff(day, dt.TheDate ,  isnull(jbs.completed_datetime, getdate())) >=0

    Join [dbo].assets_sc ast  on ast.asset_id = jbs.asset_id

    where asset_code = 'TCS'

    And jbs.[availability] ='Y'

    order by created_date, thedate

    sc_detailoutput

    Then the overlap time query rollup works perfectly:

    ;WITH C1 AS (

    SELECT dt.TheDate

    ,jbs.asset_id

    ,ts=jbs.created_date

    ,Type=1

    ,e=NULL

    ,s=ROW_NUMBER() OVER (PARTITION BY dt.thedate, jbs.asset_id ORDER BY jbs.created_date)

    ---,JobCount = COUNT(*) OVER (PARTITION BY dt.thedate, jbs.asset_id ORDER BY jbs.created_date)

    FROM  [dbo].jobs_sc jbs

    Join dbo.datedim_sc  dt  on

    Datediff(day, dt.TheDate , jbs.created_date) <=0

    and Datediff(day, dt.TheDate ,  isnull(jbs.completed_datetime, getdate())) >=0

    Join [dbo].assets_sc ast  on ast.asset_id = jbs.asset_id

    where asset_code = 'TCS'

    And jbs.[availability] ='Y'

    UNION ALL

    SELECT dt.TheDate

    ,jbs.asset_id

    ,ts=isnull(jbs.completed_datetime, getdate())

    ,Type=-1

    ,e=ROW_NUMBER() OVER (PARTITION BY dt.thedate, jbs.asset_id ORDER BY jbs.completed_datetime)

    ,s=NULL

    FROM [dbo].jobs_sc jbs

    Join dbo.datedim_sc  dt  on

    Datediff(day, dt.TheDate , jbs.created_date) <=0

    and Datediff(day, dt.TheDate ,  isnull(jbs.completed_datetime, getdate())) >=0

    Join [dbo].assets_sc ast  on ast.asset_id = jbs.asset_id

    where asset_code = 'TCS'

    And jbs.[availability] ='Y'),

    C2 AS (

    SELECT C1.*

    ,se=ROW_NUMBER() OVER (PARTITION BY thedate,asset_id ORDER BY ts, Type DESC)

    FROM C1),

    C3 AS (

    SELECT thedate, asset_Id, ts

    ,grpnm=FLOOR((ROW_NUMBER() OVER (PARTITION BY thedate,asset_id ORDER BY ts)-1) / 2 + 1)

    FROM C2

    WHERE COALESCE(s-(se-s)-1, (se-e)-e) = 0),

    C4 as (SELECT       thedate

    , asset_Id

    , StartDate=MIN(ts)

    , EndDate=MAX(ts)

    , DATEDIFF(minute, MIN(ts),MAX(ts)) as MinDown

    FROM C3

    GROUP BY thedate, asset_Id, grpnm)

    ---order by asset_id, StartDate

    select * from C4

    sc_overlapfixed

    But when I add in the JobCnt derivation it rolls up the last 2 records because the group by uses the “thedate” column.  If I add the “StartDate” into the group by then the record that is overlapping produces an incorrect JobCnt.

    ;WITH C1 AS (

    SELECT dt.TheDate

    ,jbs.asset_id

    ,ts=jbs.created_date

    ,Type=1

    ,e=NULL

    ,s=ROW_NUMBER() OVER (PARTITION BY dt.thedate, jbs.asset_id ORDER BY jbs.created_date)

    ---,JobCount = COUNT(*) OVER (PARTITION BY dt.thedate, jbs.asset_id ORDER BY jbs.created_date)

    FROM  [dbo].jobs_sc jbs

    Join dbo.datedim_sc  dt  on

    Datediff(day, dt.TheDate , jbs.created_date) <=0

    and Datediff(day, dt.TheDate ,  isnull(jbs.completed_datetime, getdate())) >=0

    Join [dbo].assets_sc ast  on ast.asset_id = jbs.asset_id

    where asset_code = 'TCS'

    And jbs.[availability] ='Y'

    UNION ALL

    SELECT dt.TheDate

    ,jbs.asset_id

    ,ts=isnull(jbs.completed_datetime, getdate())

    ,Type=-1

    ,e=ROW_NUMBER() OVER (PARTITION BY dt.thedate, jbs.asset_id ORDER BY jbs.completed_datetime)

    ,s=NULL

    FROM [dbo].jobs_sc jbs

    Join dbo.datedim_sc  dt  on

    Datediff(day, dt.TheDate , jbs.created_date) <=0

    and Datediff(day, dt.TheDate ,  isnull(jbs.completed_datetime, getdate())) >=0

    Join [dbo].assets_sc ast  on ast.asset_id = jbs.asset_id

    where asset_code = 'TCS'

    And jbs.[availability] ='Y'),

    C2 AS (

    SELECT C1.*

    ,se=ROW_NUMBER() OVER (PARTITION BY thedate,asset_id ORDER BY ts, Type DESC)

    FROM C1),

    C3 AS (

    SELECT thedate, asset_Id, ts

    ,grpnm=FLOOR((ROW_NUMBER() OVER (PARTITION BY thedate,asset_id ORDER BY ts)-1) / 2 + 1)

    FROM C2

    WHERE COALESCE(s-(se-s)-1, (se-e)-e) = 0),

    C4 as (SELECT       thedate

    , asset_Id

    , StartDate=MIN(ts)

    , EndDate=MAX(ts)

    , DATEDIFF(minute, MIN(ts),MAX(ts)) as MinDown

    FROM C3

    GROUP BY thedate, asset_Id, grpnm)

     

    ---Final output with jobcount

    Select Distinct

    "KPI No"=  51

    ,"KPI Type" = 'Availability'

    ,dt.Thedate  --as ReportMonthYear

    ,ast.Asset_id

    ,ast.Asset_code

    ,60*24 as DailyTotalTime

    ,StartDate DownTimeStart

    ,EndDate DownTimeEnd

    ,MinDown

    ,Isnull(a.JobCount,0) as JobCount

    from [dbo].[assets_sc] ast

    join jobs_sc jsc on jsc.asset_id = ast.asset_id

    Join dbo.datedim_sc  dt  on Datediff(day, dt.TheDate , jsc.created_date) <=0

    and Datediff(day, dt.TheDate ,  isnull(nullif(jsc.completed_datetime  ,'3000-01-01 00:00:00.000'), getdate())) >=0

    Join C4 jbs on jbs.thedate = dt.thedate and jbs.asset_id = ast.asset_id

    Join (Select        dt.TheDate

    ,jbs.asset_id

    ,min(jbs.created_date) as created_date

    ,Isnull(Max(jbs.completed_datetime),getdate()) as completed_datetime

    ,count(*) as JobCount

    from [dbo].[jobs_sc] jbs

    Join dbo.datedim_sc  dt  on

    Datediff(day, dt.TheDate , jbs.created_date) <=0

    and Datediff(day, dt.TheDate ,  isnull(jbs.completed_datetime, getdate())) >=0

    Join [dbo].[assets_sc] ast  on ast.asset_id = jbs.asset_id

    Where

    asset_code = 'TCS'

    And jbs.[availability] ='Y'

    Group by dt.TheDate, jbs.asset_id) a on a.TheDate = dt.TheDate and a.asset_id = ast.asset_id

    Where asset_code =  'TCS'

    order by thedate, asset_id

    sc_jobcount

     

    Where I need the result to be this

    sc_jobcountrequired

     

  • It looks like I have resolved this with the query running in reasonable time and producing correct results.  Not sure if the final query is the most efficient but it works fine on the current data.

    The only change was to the final inline query that produces the counts, which works and then you use the join where it's dates are between the dates in the temp table

    Select
    "KPI No"= 51
    ,"KPI Type" = 'Availability'
    ,dt.Thedate --as ReportMonthYear
    ,ast.Asset_id
    ,ast.Asset_code
    ,ast.Asset_Type
    ,60*24 as DailyTotalTime
    ,StartDate DownTimeStart
    ,EndDate DownTimeEnd
    ---the overlap duration seems to get duplicated so a min fixes that
    ,MIN(MinDown) as AssetDailyDowntimeMinutes
    ---sum the count of records and group by the above
    ,SUM(Isnull(a.JobCount,0)) as JobCount
    from [dbo].[Assets] ast
    Join dbo.DateDimension dt on Datediff(day, dt.TheDate , ast.feature_start_date) <=0
    and Datediff(day, dt.TheDate , isnull(nullif(ast.feature_end_date,'3000-01-01 00:00:00.000'), getdate())) >=0
    Join C4 jbs on jbs.thedate = dt.thedate and jbs.asset_id = ast.asset_id
    ---this is where we want to bring in the record count for each record
    Join (Select dt.TheDate
    ,jbs.asset_id
    ,jbs.created_date as created_date
    ,Isnull(jbs.completed_datetime,getdate()) as completed_datetime
    ,JobCount = 1
    from [dbo].[Jobs] jbs
    Join dbo.DateDimension dt on
    Datediff(day, dt.TheDate , jbs.created_date) <=0
    and Datediff(day, dt.TheDate , isnull(jbs.completed_datetime, getdate())) >=0
    Join [dbo].[Assets] ast on ast.asset_id = jbs.asset_id
    Where asset_code = 'TCS'
    And jbs.[availability] ='Y' and isnull(ast.deadflag,'') <> 'Y'
    ) a on a.TheDate = jbs.TheDate and a.asset_id = jbs.asset_id
    ---then join on the dates where they are between the dates being returned in the overlap code
    and (a.created_date between jbs.StartDate and jbs.EndDate)
    and (a.completed_datetime between jbs.StartDate and jbs.EndDate)
    Where asset_code ='TCS'
    ---did a group by so the records count can be summed
    group by dt.TheDate, ast.asset_id,ast.asset_code,ast.asset_type, jbs.StartDate, jbs.EndDate
    order by thedate, asset_id, StartDate

     

Viewing 4 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply