Find # of active days based on status records

  • Given the following table structure and data sample, I need to find the number of days consumers were active in our program in a given time period.

    DECLARE @ConsumerStatus TABLE (StatPK INT, ConsumerFK INT, StatusCode CHAR(1), -- A = Active, I = Inactive

    StatusBeginDate DATETIME)

    INSERT INTO @ConsumerStatus (

    StatPK,

    ConsumerFK,

    StatusCode,

    StatusBeginDate

    )

    SELECT 1,33,'A','12/15/2008' UNION ALL

    SELECT 2,33,'I','02/12/2009' UNION ALL

    SELECT 3,33,'A','05/05/2009' UNION ALL

    SELECT 4,37,'A','01/25/2009' UNION ALL

    SELECT 5,37,'A','02/21/2009' UNION ALL

    SELECT 6,37,'I','03/01/2009'

    What I need to know is how many days each consumer was active between 2 dates. Note that you can go from inactive back to active and you can have 2 or more of the same status in a row.

    In the example above, given the date range of 1/1/2009 through 03/31/2009, consumer 33 was active from 1/1/2009 to 2/11/09. Consumer 37 was active from 1/25/09 to 2/28/09.

    I have tried to come up with a query that calculates the end date for each status record, but couldn't figure it out.

    TIA

    John

    John Deupree

  • Are you looking for the date range a customer was active, or are you asking how many days a customer was active given a specified data range?

  • Give this self join method a try..I apologize for any syntax errors ahead of time...I don't have access to my sql server at the moment to test with....

    select datediff(day, a.statusbegindate, b.statusbegindate), a.consumerfk

    from consumerstatus a, consumerstatus b

    where

    a.statuscode = 'A' and a.statusbegindate =

    (select top 1 statusbegindate from consumerstatus

    where consumerfk = a.consumerfk and

    statusbegin date between '1/1/2009' and '3/31/2009'

    order by statusbegindate asc)

    and

    b.statusbegindate =

    (select top 1 statusbegindate from consumerstatus

    where consumerfk = a.consumerfk and

    statusbegin date between '1/1/2009' and '3/31/2009'

    order by statusbegindate desc)

    group by a.consumerfk

  • John Deupree (4/29/2009)


    What I need to know is how many days each consumer was active between 2 dates. Note that you can go from inactive back to active and you can have 2 or more of the same status in a row.

    There's actually quite a bit of logic that needs to be taken care of. It might be able to be simplified. Anyway, this code:

    DECLARE @ConsumerStatus TABLE (

    StatPK INT,

    ConsumerFK INT,

    StatusCode CHAR(1), -- A = Active, I = Inactive

    StatusBeginDate DATETIME,

    ActiveDays INT DEFAULT 0, --<<<<< LOOK!!! field for storing the number of active days with the date range

    PRIMARY KEY CLUSTERED (ConsumerFK, StatusBeginDate, StatusCode)) --<<<<< LOOK!!! Clustered PK to control the update order.

    INSERT INTO @ConsumerStatus (

    StatPK, ConsumerFK, StatusCode, StatusBeginDate)

    SELECT 1,33,'A','12/15/2008' UNION ALL

    SELECT 2,33,'I','02/12/2009' UNION ALL

    SELECT 3,33,'A','05/05/2009' UNION ALL

    SELECT 4,37,'A','01/25/2009' UNION ALL

    SELECT 5,37,'A','02/21/2009' UNION ALL

    SELECT 6,37,'I','03/01/2009'

    -- declare and initialize the date range variables

    declare @StartDate datetime, @EndDate datetime

    select @StartDate = '20090101', @EndDate = '20090331'

    -- declare variables to be used in the update

    declare @ConsumerFK INT, @StatusCode CHAR(1), @StatusBeginDate DATETIME, @ActiveDays INT

    set @StatusBeginDate = @StartDate

    set @ConsumerFK = -1

    set @StatusCode = ''

    -- perform "quirky update". See below link for details of how this works.

    -- http://www.simple-talk.com/sql/t-sql-programming/importing-text-based-data-workbench/

    update @ConsumerStatus

    set @StatusBeginDate = case when @StatusCode = 'A' and @ConsumerFK ConsumerFK and StatusBeginDate < @StartDate

    then NULL -- if an active record for a new consumer and it starts before the start date then use a null date

    when @StatusCode = 'A' and @ConsumerFK ConsumerFK and StatusBeginDate between @StartDate and @EndDate

    then StatusBeginDate -- if an active record for a new consumer and the date is within the date range, use the record's date

    when StatusBeginDate not between @StartDate and @EndDate

    then NULL -- if the record's date is not within the date range, use a null date

    --when @StatusCode = 'I' then StatusBeginDate

    else @StatusBeginDate -- for everything else, use the date calculated from the prior field's date

    end,

    -- store the number of active days

    @ActiveDays = ActiveDays = case -- assume that consumers are inactive until an active flag is reached.

    when @ConsumerFK ConsumerFK then 0 -- all other new consumers is zero

    when @StatusCode = 'I' then 0 -- if last record was inactive, then zero

    when StatusCode = 'I' and StatusBeginDate @StartDate and StatusBeginDate <= @EndDate

    then DateDiff(dd, @StatusBeginDate, StatusBeginDate) -- get the # of days between current date and last start date

    else DateDiff(dd, @StartDate, StatusBeginDate)

    end,

    -- if the last record and the current record are both active records, then change the @StatusBeginDate

    @StatusBeginDate = case when @StatusCode = 'A' and StatusCode = 'A' then StatusBeginDate else @StatusBeginDate end,

    -- put the current value of the remaining fields into the variables for processing for the next record.

    @ConsumerFK = ConsumerFK,

    @StatusCode = StatusCode

    -- get a sum of each of the consumers

    select

    ConsumerFK, sum(ActiveDays) ActiveDays

    from @ConsumerStatus

    group by ConsumerFK

    returns this result set:

    ConsumerFKActiveDays

    33 42

    37 35

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • WayneS (4/29/2009)


    John Deupree (4/29/2009)


    What I need to know is how many days each consumer was active between 2 dates. Note that you can go from inactive back to active and you can have 2 or more of the same status in a row.

    There's actually quite a bit of logic that needs to be taken care of. It might be able to be simplified. Anyway, this code:

    DECLARE @ConsumerStatus TABLE (

    StatPK INT,

    ConsumerFK INT,

    StatusCode CHAR(1), -- A = Active, I = Inactive

    StatusBeginDate DATETIME,

    ActiveDays INT DEFAULT 0, --<<<<< LOOK!!! field for storing the number of active days with the date range

    PRIMARY KEY CLUSTERED (ConsumerFK, StatusBeginDate, StatusCode)) --<<<<< LOOK!!! Clustered PK to control the update order.

    INSERT INTO @ConsumerStatus (

    StatPK, ConsumerFK, StatusCode, StatusBeginDate)

    SELECT 1,33,'A','12/15/2008' UNION ALL

    SELECT 2,33,'I','02/12/2009' UNION ALL

    SELECT 3,33,'A','05/05/2009' UNION ALL

    SELECT 4,37,'A','01/25/2009' UNION ALL

    SELECT 5,37,'A','02/21/2009' UNION ALL

    SELECT 6,37,'I','03/01/2009'

    -- declare and initialize the date range variables

    declare @StartDate datetime, @EndDate datetime

    select @StartDate = '20090101', @EndDate = '20090331'

    -- declare variables to be used in the update

    declare @ConsumerFK INT, @StatusCode CHAR(1), @StatusBeginDate DATETIME, @ActiveDays INT

    set @StatusBeginDate = @StartDate

    set @ConsumerFK = -1

    set @StatusCode = ''

    -- perform "quirky update". See below link for details of how this works.

    -- http://www.simple-talk.com/sql/t-sql-programming/importing-text-based-data-workbench/

    update @ConsumerStatus

    set @StatusBeginDate = case when @StatusCode = 'A' and @ConsumerFK ConsumerFK and StatusBeginDate < @StartDate

    then NULL -- if an active record for a new consumer and it starts before the start date then use a null date

    when @StatusCode = 'A' and @ConsumerFK ConsumerFK and StatusBeginDate between @StartDate and @EndDate

    then StatusBeginDate -- if an active record for a new consumer and the date is within the date range, use the record's date

    when StatusBeginDate not between @StartDate and @EndDate

    then NULL -- if the record's date is not within the date range, use a null date

    --when @StatusCode = 'I' then StatusBeginDate

    else @StatusBeginDate -- for everything else, use the date calculated from the prior field's date

    end,

    -- store the number of active days

    @ActiveDays = ActiveDays = case -- assume that consumers are inactive until an active flag is reached.

    when @ConsumerFK ConsumerFK then 0 -- all other new consumers is zero

    when @StatusCode = 'I' then 0 -- if last record was inactive, then zero

    when StatusCode = 'I' and StatusBeginDate @StartDate and StatusBeginDate <= @EndDate

    then DateDiff(dd, @StatusBeginDate, StatusBeginDate) -- get the # of days between current date and last start date

    else DateDiff(dd, @StartDate, StatusBeginDate)

    end,

    -- if the last record and the current record are both active records, then change the @StatusBeginDate

    @StatusBeginDate = case when @StatusCode = 'A' and StatusCode = 'A' then StatusBeginDate else @StatusBeginDate end,

    -- put the current value of the remaining fields into the variables for processing for the next record.

    @ConsumerFK = ConsumerFK,

    @StatusCode = StatusCode

    -- get a sum of each of the consumers

    select

    ConsumerFK, sum(ActiveDays) ActiveDays

    from @ConsumerStatus

    group by ConsumerFK

    returns this result set:

    ConsumerFKActiveDays

    33 42

    37 35

    The "quirky update" is the direction I was going, but I would still like to see the OP clarify the question so that we know what is actually being asked.

  • This ended up being more complex that I thought when I first jumped into it. Lot's of conditions to check for. It can probably be improved upon, but... it's working.

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • Lynn Pettis (4/29/2009)


    Are you looking for the date range a customer was active, or are you asking how many days a customer was active given a specified data range?

    Number of days.

    John Deupree

  • John Deupree (4/30/2009)


    Lynn Pettis (4/29/2009)


    Are you looking for the date range a customer was active, or are you asking how many days a customer was active given a specified data range?

    Number of days.

    Based on what? Just the data in the table or a given date range provided to the query?

  • Thanks for the reply. Wow, what an interesting article and technique. I can see how that might have a lot of uses. I never thought about the order in which updates were 'applied' to a table. I assume that that would be the same (in order of the clustered index) even if a WHERE clause was used?

    John

    WayneS (4/29/2009)


    John Deupree (4/29/2009)


    What I need to know is how many days each consumer was active between 2 dates. Note that you can go from inactive back to active and you can have 2 or more of the same status in a row.

    There's actually quite a bit of logic that needs to be taken care of. It might be able to be simplified. Anyway, this code:

    -snip-

    John Deupree

  • Sorry about the confusion. Given the date range 1/1/2009 through 3/31/2009, how many days was each consumer active during that date range. Wayne's 'quirky update' technique above seems to do the trick.

    Thanks

    Lynn Pettis (4/30/2009)


    John Deupree (4/30/2009)


    Lynn Pettis (4/29/2009)


    Are you looking for the date range a customer was active, or are you asking how many days a customer was active given a specified data range?

    Number of days.

    Based on what? Just the data in the table or a given date range provided to the query?

    John Deupree

  • Here's an alternative method that uses CTE's. It produces the same results as Wayne's quirky update given the OP's test data.

    ConsumerFK ActiveDays

    33 42

    37 35

    DECLARE @ConsumerStatus TABLE (

    StatPK INT,

    ConsumerFK INT,

    StatusCode CHAR(1), -- A = Active, I = Inactive

    StatusBeginDate DATETIME

    )

    INSERT INTO @ConsumerStatus (StatPK, ConsumerFK, StatusCode, StatusBeginDate)

    SELECT 1, 33, 'A', '20081215' UNION ALL

    SELECT 2, 33, 'I', '20090212' UNION ALL

    SELECT 3, 33, 'A', '20090505' UNION ALL

    SELECT 4, 37, 'A', '20090125' UNION ALL

    SELECT 5, 37, 'A', '20090221' UNION ALL

    SELECT 6, 37, 'I', '20090301'

    DECLARE @StartDate datetime

    DECLARE @EndDate datetime

    SELECT @StartDate = '20090101', @EndDate = '20090331'

    ;WITH cteCN AS (

    SELECT

    ROW_NUMBER() OVER (PARTITION BY ConsumerFK ORDER BY StatusBeginDate)

    - ROW_NUMBER() OVER (PARTITION BY ConsumerFK, StatusCode ORDER BY StatusBeginDate) AS cn,

    ConsumerFK,

    StatusCode,

    StatusBeginDate

    FROM @ConsumerStatus

    ),

    cteAgg AS (

    SELECT

    ROW_NUMBER() OVER (PARTITION BY ConsumerFK ORDER BY MIN(StatusBeginDate)) AS rn,

    ConsumerFK,

    StatusCode,

    MIN(StatusBeginDate) AS StatusBeginDate

    FROM cteCN

    GROUP BY ConsumerFK, StatusCode, cn

    )

    SELECT

    A.ConsumerFK,

    SUM(DATEDIFF(day,

    CASE WHEN (A.StatusBeginDate >= @StartDate) THEN A.StatusBeginDate ELSE @StartDate END,

    CASE WHEN (I.StatusBeginDate <= @EndDate) THEN I.StatusBeginDate ELSE @EndDate + 1 END)) AS ActiveDays

    FROM cteAgg A

    LEFT OUTER JOIN cteAgg I ON (A.ConsumerFK = I.ConsumerFK AND A.rn = I.rn - 1)

    WHERE (A.StatusBeginDate @StartDate)

    AND (A.StatusCode = 'A')

    GROUP BY A.ConsumerFK

    ORDER BY A.ConsumerFK

  • Very nice! It took me a while to get my head around using the row number functions to eliminate the consecutive 'same status' records.

    Thanks for the reply.

    andrewd.smith (4/30/2009)


    Here's an alternative method that uses CTE's. It produces the same results as Wayne's quirky update given the OP's test data.

    ConsumerFK ActiveDays

    33 42

    37 35

    DECLARE @ConsumerStatus TABLE (

    StatPK INT,

    ConsumerFK INT,

    StatusCode CHAR(1), -- A = Active, I = Inactive

    StatusBeginDate DATETIME

    )

    INSERT INTO @ConsumerStatus (StatPK, ConsumerFK, StatusCode, StatusBeginDate)

    SELECT 1, 33, 'A', '20081215' UNION ALL

    SELECT 2, 33, 'I', '20090212' UNION ALL

    SELECT 3, 33, 'A', '20090505' UNION ALL

    SELECT 4, 37, 'A', '20090125' UNION ALL

    SELECT 5, 37, 'A', '20090221' UNION ALL

    SELECT 6, 37, 'I', '20090301'

    DECLARE @StartDate datetime

    DECLARE @EndDate datetime

    SELECT @StartDate = '20090101', @EndDate = '20090331'

    ;WITH cteCN AS (

    SELECT

    ROW_NUMBER() OVER (PARTITION BY ConsumerFK ORDER BY StatusBeginDate)

    - ROW_NUMBER() OVER (PARTITION BY ConsumerFK, StatusCode ORDER BY StatusBeginDate) AS cn,

    ConsumerFK,

    StatusCode,

    StatusBeginDate

    FROM @ConsumerStatus

    ),

    cteAgg AS (

    SELECT

    ROW_NUMBER() OVER (PARTITION BY ConsumerFK ORDER BY MIN(StatusBeginDate)) AS rn,

    ConsumerFK,

    StatusCode,

    MIN(StatusBeginDate) AS StatusBeginDate

    FROM cteCN

    GROUP BY ConsumerFK, StatusCode, cn

    )

    SELECT

    A.ConsumerFK,

    SUM(DATEDIFF(day,

    CASE WHEN (A.StatusBeginDate >= @StartDate) THEN A.StatusBeginDate ELSE @StartDate END,

    CASE WHEN (I.StatusBeginDate <= @EndDate) THEN I.StatusBeginDate ELSE @EndDate + 1 END)) AS ActiveDays

    FROM cteAgg A

    LEFT OUTER JOIN cteAgg I ON (A.ConsumerFK = I.ConsumerFK AND A.rn = I.rn - 1)

    WHERE (A.StatusBeginDate @StartDate)

    AND (A.StatusCode = 'A')

    GROUP BY A.ConsumerFK

    ORDER BY A.ConsumerFK

    John Deupree

  • andrewd.smith (4/30/2009)


    Here's an alternative method that uses CTE's. It produces the same results as Wayne's quirky update given the OP's test data.

    ConsumerFK ActiveDays

    33 42

    37 35

    DECLARE @ConsumerStatus TABLE (

    StatPK INT,

    ConsumerFK INT,

    StatusCode CHAR(1), -- A = Active, I = Inactive

    StatusBeginDate DATETIME

    )

    INSERT INTO @ConsumerStatus (StatPK, ConsumerFK, StatusCode, StatusBeginDate)

    SELECT 1, 33, 'A', '20081215' UNION ALL

    SELECT 2, 33, 'I', '20090212' UNION ALL

    SELECT 3, 33, 'A', '20090505' UNION ALL

    SELECT 4, 37, 'A', '20090125' UNION ALL

    SELECT 5, 37, 'A', '20090221' UNION ALL

    SELECT 6, 37, 'I', '20090301'

    DECLARE @StartDate datetime

    DECLARE @EndDate datetime

    SELECT @StartDate = '20090101', @EndDate = '20090331'

    ;WITH cteCN AS (

    SELECT

    ROW_NUMBER() OVER (PARTITION BY ConsumerFK ORDER BY StatusBeginDate)

    - ROW_NUMBER() OVER (PARTITION BY ConsumerFK, StatusCode ORDER BY StatusBeginDate) AS cn,

    ConsumerFK,

    StatusCode,

    StatusBeginDate

    FROM @ConsumerStatus

    ),

    cteAgg AS (

    SELECT

    ROW_NUMBER() OVER (PARTITION BY ConsumerFK ORDER BY MIN(StatusBeginDate)) AS rn,

    ConsumerFK,

    StatusCode,

    MIN(StatusBeginDate) AS StatusBeginDate

    FROM cteCN

    GROUP BY ConsumerFK, StatusCode, cn

    )

    SELECT

    A.ConsumerFK,

    SUM(DATEDIFF(day,

    CASE WHEN (A.StatusBeginDate >= @StartDate) THEN A.StatusBeginDate ELSE @StartDate END,

    CASE WHEN (I.StatusBeginDate <= @EndDate) THEN I.StatusBeginDate ELSE @EndDate + 1 END)) AS ActiveDays

    FROM cteAgg A

    LEFT OUTER JOIN cteAgg I ON (A.ConsumerFK = I.ConsumerFK AND A.rn = I.rn - 1)

    WHERE (A.StatusBeginDate @StartDate)

    AND (A.StatusCode = 'A')

    GROUP BY A.ConsumerFK

    ORDER BY A.ConsumerFK

    Andrew, this method is really neat.

    I've seen it used a few times before, but I don't quite understand it.

    Would you be able to explain?

  • This ROW_NUMBER() difference method used to solve the problem of combining sequential rows originated from the following article, or at least that's where I first saw it used.

    http://www.sqlmag.com/Article/ArticleID/93606/sql_server_93606.html

    Suppose we have some data that records a status code and a numeric value for a sequence of dates.

    CREATE TABLE #TestData (

    DateStamp datetime,

    Status char(1),

    Number int

    )

    GO

    INSERT #TestData (DateStamp, Status, Number)

    SELECT '2009-03-02T00:00:00', 'A', 15 UNION ALL

    SELECT '2009-03-03T00:00:00', 'A', 10 UNION ALL

    SELECT '2009-03-04T00:00:00', 'A', 12 UNION ALL

    SELECT '2009-03-05T00:00:00', 'B', 17 UNION ALL

    SELECT '2009-03-06T00:00:00', 'B', 23 UNION ALL

    SELECT '2009-03-09T00:00:00', 'B', 18 UNION ALL

    SELECT '2009-03-10T00:00:00', 'A', 9 UNION ALL

    SELECT '2009-03-11T00:00:00', 'B', 20 UNION ALL

    SELECT '2009-03-12T00:00:00', 'B', 8 UNION ALL

    SELECT '2009-03-13T00:00:00', 'A', 12 UNION ALL

    SELECT '2009-03-16T00:00:00', 'A', 10 UNION ALL

    SELECT '2009-03-17T00:00:00', 'A', 16

    Given this data, you might wish to:

    a) eliminate consecutive rows with duplicate statuses,

    b) determine the row count and sum of the Number column for consecutive rows with the same status,

    c) find the longest consecutive sequence of rows with the same status.

    Obviously simply grouping by the Status column will no do, since it will aggregate non-consecutive rows. However, consecutive sequences can be grouped using the difference between two ROW_NUMBER() function expressions.

    To see how this works, check the results of the following query:

    SELECT DateStamp, Status,

    ROW_NUMBER() OVER (ORDER BY DateStamp) AS RN,

    ROW_NUMBER() OVER (PARTITION BY Status ORDER BY DateStamp) AS GN,

    ROW_NUMBER() OVER (ORDER BY DateStamp) - ROW_NUMBER() OVER (PARTITION BY Status ORDER BY DateStamp) AS CN

    FROM #TestData

    ORDER BY DateStamp

    DateStamp Status RN GN CN

    2009-03-02 00:00:00.000 A 1 1 0

    2009-03-03 00:00:00.000 A 2 2 0

    2009-03-04 00:00:00.000 A 3 3 0

    2009-03-05 00:00:00.000 B 4 1 3

    2009-03-06 00:00:00.000 B 5 2 3

    2009-03-09 00:00:00.000 B 6 3 3

    2009-03-10 00:00:00.000 A 7 4 3

    2009-03-11 00:00:00.000 B 8 4 4

    2009-03-12 00:00:00.000 B 9 5 4

    2009-03-13 00:00:00.000 A 10 5 5

    2009-03-16 00:00:00.000 A 11 6 5

    2009-03-17 00:00:00.000 A 12 7 5

    Note that for this query, column CN = RN - GN

    Let's examine a row at the start of a consecutive sequence, e.g. the 4th row with a DateStamp of '2009-03-05T00:00:00' and Status of 'B'. The value of the RN column is 4 and the value of the GN column is 1, so the difference (CN) is 3. The 5th row (when sorted by DateStamp) also has a Status of 'B', so the value of the RN and GN columns will both be incremented by 1, to 5 and 2 respectively, and therefore the difference (CN) is also 3. The 6th row also has a Status of 'B', so again the value of the RN and GN columns will both be incremented by 1, to 6 and 3 respectively, and the difference is still 3.

    This shows that the ROW_NUMBER() difference expression will have the same value for consecutive rows with the same Status, but we also need to be sure that a different consecutive sequence with the same value of the Status column does not produce the same value for the ROW_NUMBER() difference expression.

    The next row with a Status of 'B' in the test data is the 8th row. The value of the RN column is 8, but the value of the GN column is 4, so the difference (CN) is now 4. The difference between the two ROW_NUMBER() expressions has incremented between the two consecutive sequences of rows with Status 'B' because the intervening 7th row with a Status of 'A' causes the RN expression to increment by one for all subsequent rows, but does not change the value of the GN column for any subsequent rows with a Status of 'B'.

    Note that the difference between the two ROW_NUMBER() expressions in column CN does not by itself uniquely identify consecutive sequences, but it does uniquely identify a sequence for a given Status value. For example, the 4th, 5th and 6th rows with Status 'B' have the same CN value (3) as the 7th row with Status 'A', but the combination of the Status and CN columns uniquely identifies each consecutive sequence, therefore, both these columns must be included in any GROUP BY clause when aggregating the data. Also, it is important that the same sort order is used for the two ROW_NUMBER() expressions RN and GN.

    So, going back to the list of 3 potential requirements:

    a) eliminate consecutive rows with duplicate statuses

    ;WITH cteCN AS (

    SELECT

    DateStamp,

    Status,

    ROW_NUMBER() OVER (ORDER BY DateStamp)

    - ROW_NUMBER() OVER (PARTITION BY Status ORDER BY DateStamp) AS CN

    FROM #TestData

    )

    SELECT

    MIN(DateStamp) AS DateStamp,

    Status

    FROM cteCN

    GROUP BY Status, CN

    ORDER BY MIN(DateStamp)

    b) determine the row count and sum of the Number column for consecutive rows with the same status

    ;WITH cteCN AS (

    SELECT

    DateStamp,

    Status,

    Number,

    ROW_NUMBER() OVER (ORDER BY DateStamp)

    - ROW_NUMBER() OVER (PARTITION BY Status ORDER BY DateStamp) AS CN

    FROM #TestData

    )

    SELECT

    MIN(DateStamp) AS RangeStart,

    MAX(DateStamp) AS RangeEnd,

    Status,

    COUNT(DateStamp) AS RangeCount,

    SUM(Number) AS RangeTotal

    FROM cteCN

    GROUP BY Status, CN

    ORDER BY MIN(DateStamp)

    c) find the longest consecutive sequence of rows with the same status

    ;WITH cteCN AS (

    SELECT

    DateStamp,

    Status,

    ROW_NUMBER() OVER (ORDER BY DateStamp)

    - ROW_NUMBER() OVER (PARTITION BY Status ORDER BY DateStamp) AS CN

    FROM #TestData

    ),

    cteAgg AS (

    SELECT

    Status,

    COUNT(*) AS SequenceLength

    FROM cteCN

    GROUP BY Status, CN

    )

    SELECT

    Status,

    MAX(SequenceLength) AS MaximumSequenceLength

    FROM cteAgg

    GROUP BY Status

    ORDER BY Status

  • Wow!

    Thanks for taking the time out to write such a thorough explanation, Andrew!

    I really appreciate it. Makes a lot more sense now. 🙂

    Thanks again,

    Goldie

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply