Sum when between

  • Hi

    I am trying to do a count, to see how many times N-CO occurred until the Re-Act was completed.

    Could you help out?

    Expecting to see 3 not 1.


    CREATE TABLE #temp
    (
    ServiceNumber varchar(20) NOT NULL
    ,Identifier varchar(20)NOT NULL
    ,CreatedDate DATETIME NOT NULL
    ,CompletedDate DATETIME NOT NULL
    ,SN_Type varchar(20) NOT NULL
    ,SN_Status varchar(20) NOT NULL
    )
    ;
    INSERT INTO #temp
    VALUES('01031861999','64074558792','20160729','20160805','Re-Activattion','N-CO')
    INSERT INTO #temp
    VALUES('02031861999','64074558792','20160810','20160810','Re-Activattion','N-CO')
    INSERT INTO #temp
    VALUES('02031861999','64074558792','20160812','20160812','Re-Activattion','N-CO')
    INSERT INTO #temp
    VALUES('03031861999','64074558792','20160811','20160811','Re-Activattion','COMP')
    INSERT INTO #temp
    VALUES('03031861999','64074558792','20160811','20160813','Re-Activattion','N-CO')
    ;
    ;

    SELECT
        Identifier
        ,RE_ACT_COMPLETION_TIME
        ,SUM(CASE WHEN RE_ACT_COMPLETION_TIME < RE_ACT_NCO_TIME THEN 0 ELSE 1 END) AS [RE-AN NCO #]

    FROM
    (
        select identifier
                    ,case
                 when sum(case when SN_STATUS='COMP' and SN_TYPE = 'Re-Activattion' then 1 else 0 end)>0
                 then str(datediff(day
                                 ,min(case
                                         when SN_TYPE = 'Re-Activattion'
                                         then CreatedDate
                                         else null
                                        end
                                     )
                                 ,min(case
                                         when (SN_TYPE = 'Re-Activattion'
                                            and SN_STATUS='COMP'
                                                )
                                         then CompletedDate
                                         else null
                                        end
                                     )
                                 )
                         )
                 when sum(case when SN_TYPE='Re-Activattion' then 1 else 0 end)>0
                 then 'NOT COMP'
                 else 'NO RE-ACT'
             end
             as RE_ACT_COMPLETION_TIME
      
             ,case
                 when sum(case when SN_STATUS='COMP' and SN_TYPE = 'Re-Activattion' then 1 else 0 end)>0
                 then str(datediff(day
                                 ,min(case
                                         when SN_TYPE = 'Re-Activattion'
                                         then CreatedDate
                                         else null
                                        end
                                     )
                                 ,min(case
                                         when (SN_TYPE = 'Re-Activattion'
                                            and SN_STATUS='N-CO'
                                                )
                                         then CreatedDate
                                         else null
                                        end
                                     )
                                 )
                         )
                 else '0'
             end
             as RE_ACT_NCO_TIME

                
         from #temp
         WHERE Identifier = '64074558792'
         group by identifier
    )A
    GROUP BY
        Identifier
        ,RE_ACT_COMPLETION_TIME
    ;

  • Looks like you missed the CREATE TABLE part of this script...

  • Thanks, updated.

  • Look up Itzik Ben-Gan Gaps and Islands TSQL to find a number of solutions to this classic problem.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • TheSQLGuru - Wednesday, July 12, 2017 8:13 PM

    Look up Itzik Ben-Gan Gaps and Islands TSQL to find a number of solutions to this classic problem.

    Could you provide any more help?

    I have been reading about it, still cant get to the answer.

  • danijel.bozic87 - Wednesday, July 12, 2017 10:07 PM

    TheSQLGuru - Wednesday, July 12, 2017 8:13 PM

    Look up Itzik Ben-Gan Gaps and Islands TSQL to find a number of solutions to this classic problem.

    Could you provide any more help?

    I have been reading about it, still cant get to the answer.

    Sadly I cannot. I am just absolutely overwhelmed. Sorry!! 🙁

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Running your query with the SUM removed gets you 1 row.  So your sub-query is not working as you expect.

    Looking at your data set though, are you sure it should be 3?
    I am just trying to figure out what this should be sorted by to get "3" and the only thing I can see is the Service number?

    My brain first said "use a cursor" , but I'm forcing myself to stop thinking that cursors are valid solutions.
    Does something like this work:

    CREATE TABLE #temp
    (
    ServiceNumber varchar(20) NOT NULL
    ,Identifier varchar(20)NOT NULL
    ,CreatedDate DATETIME NOT NULL
    ,CompletedDate DATETIME NOT NULL
    ,SN_Type varchar(20) NOT NULL
    ,SN_Status varchar(20) NOT NULL
    )
    ;
    INSERT INTO #temp
    VALUES('01031861999','64074558792','20160729','20160805','Re-Activattion','N-CO')
    INSERT INTO #temp
    VALUES('02031861999','64074558792','20160810','20160810','Re-Activattion','N-CO')
    INSERT INTO #temp
    VALUES('02031861999','64074558792','20160812','20160812','Re-Activattion','N-CO')
    INSERT INTO #temp
    VALUES('03031861999','64074558792','20160811','20160811','Re-Activattion','COMP')
    INSERT INTO #temp
    VALUES('03031861999','64074558792','20160811','20160813','Re-Activattion','N-CO')
    INSERT INTO #temp
    VALUES('11031861999','74074558792','20160729','20160805','Re-Activattion','N-CO')
    INSERT INTO #temp
    VALUES('12031861999','74074558792','20160810','20160810','Re-Activattion','N-CO')
    INSERT INTO #temp
    VALUES('12031861999','74074558792','20160812','20160812','Re-Activattion','N-CO')
    INSERT INTO #temp
    VALUES('13031861999','74074558792','20160811','20160811','Re-Activattion','COMP')
    INSERT INTO #temp
    VALUES('13031861999','74074558792','20160811','20160813','Re-Activattion','N-CO')

    ;
    ;
    WITH cte AS (
    SELECT [#temp].[ServiceNumber] ,
         [#temp].[Identifier] ,
         [#temp].[CreatedDate] ,
         [#temp].[CompletedDate] ,
         [#temp].[SN_Type] ,
         [#temp].[SN_Status],
         ROW_NUMBER() OVER (PARTITION BY identifier ORDER BY createdDate, completedDate) AS row
    FROM [#temp]
    ),
    cte2 AS (
    select identifier
          ,case
         when sum(case when SN_STATUS='COMP' and SN_TYPE = 'Re-Activattion' then 1 else 0 end)>0
         then str(datediff(day
              ,min(case
                 when SN_TYPE = 'Re-Activattion'
                 then CreatedDate
                 else null
                end
               )
              ,min(case
                 when (SN_TYPE = 'Re-Activattion'
                  and SN_STATUS='COMP'
                   )
                 then CompletedDate
                 else null
                end
               )
              )
            )
         when sum(case when SN_TYPE='Re-Activattion' then 1 else 0 end)>0
         then 'NOT COMP'
         else 'NO RE-ACT'
       end
       as RE_ACT_COMPLETION_TIME
             FROM #temp
             group by identifier
    )
    SELECT [cte].Identifier,RE_ACT_COMPLETION_TIME, row-1 AS [RE-AN NCO #]
    FROM [cte]
    JOIN cte2 ON [cte2].[Identifier] = [cte].[Identifier]
    WHERE [cte].sn_status LIKE 'COMP'
    DROP TABLE #temp

    I changed your sample data a little bit so to show it works with more than 1 Identifier.  The reason I got "2" as the RE-AN NCO # is that for each Idnetifier, there are only 2 rows with a Created and Completed date before the COMP date.  You could change it to order by the ServiceNumber which would give you the "3" that you were talking about, but that feels like an odd way to order things to me.

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • bmg002 - Thursday, July 13, 2017 3:57 PM

    Looking at your data set though, are you sure it should be 3?
    I am just trying to figure out what this should be sorted by to get "3" and the only thing I can see is the Service number?

    Hi, thanks for responding and giving this a go, have been asking around with little to no help.

    The reason its 3 is because I would count the number of (created dates).
    And the very last one CreatedDate 11.08.2016 is same as CompetedDate of COMP 11.08.2016 so count it as well as the N-Co Occurred on the same day as it was completed, hence 3. 

    I gave you query a go but there seems to be a compatibility issue and I am unable to run it.


    My compatibility:


    Any suggestions? Alternative ways of doing it?

    Thanks again bmg002!

  • Here's another to try, this works with your sample data

    WITH Src AS (
    SELECT ServiceNumber, Identifier, CreatedDate, CompletedDate, SN_Type, SN_Status,
       ROW_NUMBER() OVER(PARTITION BY Identifier ORDER BY CreatedDate, CASE WHEN SN_Status = 'N-CO' THEN 0 ELSE 1 END, CompletedDate) AS rn,
       ROW_NUMBER() OVER(PARTITION BY Identifier ORDER BY CreatedDate, CASE WHEN SN_Status = 'N-CO' THEN 0 ELSE 1 END, CompletedDate) -
       ROW_NUMBER() OVER(PARTITION BY Identifier,SN_Status ORDER BY CreatedDate, CompletedDate) AS grp
    FROM #temp
    ),
    Grouped AS (
    SELECT Identifier, SN_Status,
       MIN(CreatedDate) AS StartDate,
         COUNT(*) AS [RE-AN NCO #],
         MAX(rn) AS LastRn
    FROM Src
    GROUP BY Identifier, SN_Status, grp
    )
    SELECT s.Identifier,
       DATEDIFF(day,g.StartDate,s.CreatedDate) AS RE_ACT_COMPLETION_TIME,
         g.[RE-AN NCO #]
    FROM Src s
    INNER JOIN Grouped g ON g.Identifier = s.Identifier
           AND g.SN_Status = 'N-CO'
           AND g.LastRn + 1 = s.rn
    WHERE s.SN_Status = 'COMP'
    ORDER BY rn;

    ____________________________________________________

    Deja View - The strange feeling that somewhere, sometime you've optimised this query before

    How to get the best help on a forum

    http://www.sqlservercentral.com/articles/Best+Practices/61537
  • danijel.bozic87 - Thursday, July 13, 2017 5:39 PM

    bmg002 - Thursday, July 13, 2017 3:57 PM

    Looking at your data set though, are you sure it should be 3?
    I am just trying to figure out what this should be sorted by to get "3" and the only thing I can see is the Service number?

    Hi, thanks for responding and giving this a go, have been asking around with little to no help.

    The reason its 3 is because I would count the number of (created dates).
    And the very last one CreatedDate 11.08.2016 is same as CompetedDate of COMP 11.08.2016 so count it as well as the N-Co Occurred on the same day as it was completed, hence 3. 

    I gave you query a go but there seems to be a compatibility issue and I am unable to run it.


    My compatibility:


    Any suggestions? Alternative ways of doing it?

    Thanks again bmg002!

    Not sure if your database is compatability level 80 or 90, but you posted this in the wrong section then.  Compatability level 110 is SQL 2012.  you are running it in either 2008 or 2005 compatability mode.
    I do not have one of those to test the script on at this time, but I THINK that if you put the thing into brackets it should fix that?  That is replace "row-1" with "(row - 1)".
    As for getting "3", since the created date for the COMP and N-CO is the same, I am not sure how you would determine that with confidence.  The best way I can see to do it would be to change the ROW_NUMBER() line to:
    ROW_NUMBER() OVER (PARTITION BY identifier ORDER BY createdDate, SN_Status DESC) AS row

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • Thanks for responding 

    But why is it counting 1 day if it should be from the min ( created) 20160729
    until the 20160811 when the status is completed.


    CREATE TABLE #temp
    (
    ServiceNumber varchar(20) NOT NULL
    ,Identifier varchar(20)NOT NULL
    ,CreatedDate DATETIME NOT NULL
    ,CompletedDate DATETIME NOT NULL
    ,SN_Type varchar(20) NOT NULL
    ,SN_Status varchar(20) NOT NULL
    )
    ;

    INSERT INTO #temp
    VALUES('01031861999','64074558792','20160729','20160805','Re-Activattion','N-CO');
    INSERT INTO #temp
    VALUES('01031861999','64074558792','20160729','20160805','Re-Activattion','PARTIALLY');
    INSERT INTO #temp
    VALUES('02031861999','64074558792','20160810','20160810','Re-Activattion','N-CO');
    INSERT INTO #temp
    VALUES('02031861999','64074558792','20160812','20160812','Re-Activattion','N-CO');
    INSERT INTO #temp
    VALUES('03031861999','64074558792','20160811','20160811','Re-Activattion','COMP');
    INSERT INTO #temp
    VALUES('03031861999','64074558792','20160811','20160813','Re-Activattion','N-CO');

    WITH Src AS (
    SELECT ServiceNumber, Identifier, CreatedDate, CompletedDate, SN_Type, SN_Status,
     ROW_NUMBER() OVER(PARTITION BY Identifier ORDER BY CreatedDate, CASE WHEN SN_Status = 'N-CO' THEN 0 ELSE 1 END, CompletedDate) AS rn,
     ROW_NUMBER() OVER(PARTITION BY Identifier ORDER BY CreatedDate, CASE WHEN SN_Status = 'N-CO' THEN 0 ELSE 1 END, CompletedDate) -
     ROW_NUMBER() OVER(PARTITION BY Identifier,SN_Status ORDER BY CreatedDate, CompletedDate) AS grp
    FROM #temp
    ),
    Grouped AS (
    SELECT Identifier, SN_Status,
     MIN(CreatedDate) AS StartDate,
      COUNT(*) AS [RE-AN NCO #],
      MAX(rn) AS LastRn
    FROM Src
    GROUP BY Identifier, SN_Status, grp
    )
    SELECT s.Identifier,
     DATEDIFF(day,g.StartDate,s.CreatedDate) AS RE_ACT_COMPLETION_TIME,
      g.[RE-AN NCO #]
    FROM Src s
    INNER JOIN Grouped g ON g.Identifier = s.Identifier
       AND g.SN_Status = 'N-CO'
       AND g.LastRn + 1 = s.rn
    WHERE s.SN_Status = 'COMP'
    ORDER BY rn;

  • That's because you've now got a SN_Status of 'PARTIALLY' that the query didn't cater for

    WITH Src AS (
    SELECT ServiceNumber, Identifier, CreatedDate, CompletedDate, SN_Type, SN_Status,
    ROW_NUMBER() OVER(PARTITION BY Identifier ORDER BY CreatedDate, CASE WHEN SN_Status = 'COMP' THEN 1 ELSE 0 END, CompletedDate) AS rn,
    ROW_NUMBER() OVER(PARTITION BY Identifier ORDER BY CreatedDate, CASE WHEN SN_Status = 'COMP' THEN 1 ELSE 0 END, CompletedDate) -
    ROW_NUMBER() OVER(PARTITION BY Identifier,CASE WHEN SN_Status = 'COMP' THEN 1 ELSE 0 END ORDER BY CreatedDate, CompletedDate) AS grp
    FROM #temp
    ),
    Grouped AS (
    SELECT Identifier, CASE WHEN SN_Status = 'COMP' THEN 1 ELSE 0 END AS IsCOMP,
    MIN(CreatedDate) AS StartDate,
     COUNT(*) AS [RE-AN NCO #],
     MAX(rn) AS LastRn
    FROM Src
    GROUP BY Identifier, CASE WHEN SN_Status = 'COMP' THEN 1 ELSE 0 END, grp
    )
    SELECT s.Identifier,
    DATEDIFF(day,g.StartDate,s.CreatedDate) AS RE_ACT_COMPLETION_TIME,
     g.[RE-AN NCO #]
    FROM Src s
    INNER JOIN Grouped g ON g.Identifier = s.Identifier
      AND g.IsCOMP = 0
      AND g.LastRn + 1 = s.rn
    WHERE s.SN_Status = 'COMP'
    ORDER BY rn;

    ____________________________________________________

    Deja View - The strange feeling that somewhere, sometime you've optimised this query before

    How to get the best help on a forum

    http://www.sqlservercentral.com/articles/Best+Practices/61537
  • Great, thank you!
    Now one more thing, and please if you can be patient because that is the type of data set it is (limited) and this is the only type of solution I could think of.
    If I wanted to add a tolerance of 2 weeks to each identifier so i know which groups of statuses to count.
    How would I do that using your code?
    The picture below will put it into perspective.

    Example
    Consider the identifier (61030203647) as a person
    Wanting their internet activated
    In 2017 Jan they wanted a reactivation
    Which was not completed.
    Then there was one in Feb 2017 which was completed.
    However because there is nothing to group on and to determine if it was all part of the same "request" I think grouping it by weeks of statuses would provide and indication of whether its part of the same "request".
    If you look at the account 64074558792 all the statuses were close together so you can make the assumption it was part of the same "request" hence why I thought maybe grouping it by 2 weeks would provide a valid summary.

    Hope you see the dilemma I am in and having to resort to an assumption like this to count the accurate "requests"
    "
    CREATE TABLE #temp
    (
    ServiceNumber varchar(20) NOT NULL
    ,Identifier varchar(20)NOT NULL
    ,CreatedDate DATETIME NOT NULL
    ,CompletedDate DATETIME NOT NULL
    ,SN_Type varchar(20) NOT NULL
    ,SN_Status varchar(20) NOT NULL
    )
    ;

    INSERT INTO #temp
    VALUES('01031861999','64074558792','20160729','20160805','Re-Activattion','N-CO');
    INSERT INTO #temp
    VALUES('01031861999','64074558792','20160729','20160805','Re-Activattion','PARTIALLY');
    INSERT INTO #temp
    VALUES('02031861999','64074558792','20160810','20160810','Re-Activattion','N-CO');
    INSERT INTO #temp
    VALUES('02031861999','64074558792','20160812','20160812','Re-Activattion','N-CO');
    INSERT INTO #temp
    VALUES('03031861999','64074558792','20160811','20160811','Re-Activattion','COMP');
    INSERT INTO #temp
    VALUES('03031861999','64074558792','20160811','20160813','Re-Activattion','N-CO');
    INSERT INTO #temp
    VALUES ('318403853','61030203647','20160427','20160427','Re-Activattion', 'COMP');
    INSERT INTO #temp
    VALUES('318616723','61030203647','20160427','20160427','Re-Activattion', 'N-CO');
    INSERT INTO #temp
    VALUES('318637013','61030203647','20160422','20160422','Re-Activattion', 'N-CO');
    INSERT INTO #temp
    VALUES('318639191','61030203647','20170210','20170210','Re-Activattion', 'COMP');
    INSERT INTO #temp
    VALUES('318639191','61030203647','20170110','20170110','Re-Activattion', 'N-CO');

  • I think first, I would figure out some way to order the data.
    Next, I wouldn't assume that 2 weeks means that it is a new request.  I would assume that once a COMP occurs in the column, you should assume that any future N-CO's and PARTIALLY's are part of a new request.
    But it depends on your business logic.

    From looking at the data, that is how I'd decide when a "new" request comes up anyways.  Unfortunately, we cannot decide for you how you want to decide what a "new" request is.  Is there any chance you can change the software to have some sort of requestID column?

    EDIT - also, I am a little confused how you are getting your "want" results, specifically the RE_ACT_COMPLETION column.  How do you get 5 and 0?  When I look at it, I see 3 and 2.

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • bmg002 - Friday, July 14, 2017 9:47 AM

    I think first, I would figure out some way to order the data.
    Next, I wouldn't assume that 2 weeks means that it is a new request.  I would assume that once a COMP occurs in the column, you should assume that any future N-CO's and PARTIALLY's are part of a new request.
    But it depends on your business logic.

    From looking at the data, that is how I'd decide when a "new" request comes up anyways.  Unfortunately, we cannot decide for you how you want to decide what a "new" request is.  Is there any chance you can change the software to have some sort of requestID column?

    Unfortunately no, can't change software.
    The business logic allows for N-CO and PARTIALLYS to occur without being followed by a COMP.
    But you are right that may work, any future statuses after comp could be considered NEW.
    Any idea how you would formulate that into code give the limited columns you have to group on?

Viewing 15 posts - 1 through 15 (of 18 total)

You must be logged in to reply to this topic. Login to reply