Maximum Consecutive Count

  • I have sat with this problem for over two weeks now, it is time that I admit defeat. Time for me to consult the experts...here goes:

    I have a table as follows

       DECLARE @rawData TABLE (
             [rowNo] INT IDENTITY
            ,[DateTimeNow] DATETIME
            ,[Calibration] INT
            ,[Turbidity] REAL );


    If I select all fields I am returned a result that resembles the following:

    DateTime,Calibration,Turbidity
    1/03/2018 0:00,0,7.409
    1/03/2018 0:01,0,9.401
    1/03/2018 0:02,0,8.409
    1/03/2018 0:03,0,4.403
    1/03/2018 0:04,1,4.401
    1/03/2018 0:05,0,4.402
    1/03/2018 0:06,0,4.411
    1/03/2018 0:07,0,4.477

    2/03/2018 0:00,0,0.409
    2/03/2018 0:01,0,NULL
    2/03/2018 0:02,0,NULL
    2/03/2018 0:03,0,0.403

    I need to create a query that'll return a the MAXIMUM CONSECUTIVE COUNT per day a condition(s) is true. The result should look something like this (based on the above Result):

    MaxDate,Max
    2018-03-01 00:00:00.000,4
    2018-03-02 00:00:00.000,2

    The closest I have been able to get is as follows:

        select DATEADD(dd, DATEDIFF(dd,0,DateTimeNow),0) as MaxDate,
       sum(case when (Turbidity > 2 and Calibration<> 1 and Turbidity is not null) then 1
         when Turbidity is null then 1 else 0 end) as [Max]
            from @rawData t
            group by DATEADD(dd, DATEDIFF(dd,0,DateTimeNow),0)

    Unfortunately this will not result in the outcome I am after. Please any help would be greatly appreciated, I have no hair left to pull out.

    Thank you
    Juan

  • what makes a condition = true? 
    looking at your query 6 records from the 01/03/2018 are true (1), and from 02/03/2018 2 are true (1)

    so your results would be

    2018-03-01 00:00:00.000,6
    2018-03-02 00:00:00.000,2

    ***The first step is always the hardest *******

  • I don't get the "4" . As far as I get it it should be 3
    first date
    first 3 are true - this is a consecutive 3
    next one is false (calibration = 1)
    next 3 are true - this is a consecutive 3
    second date
    first 1 is false - turbidity < 2
    next 2 are true (turbidity is null) - this is a consecutive 2
    next one is false (turbidity < 2
    DateTime          Calibration    Turbidity   True/False
    01/03/2018 00:00    0             7.409       True
    01/03/2018 00:01    0             9.401       True
    01/03/2018 00:02    0             8.409       True
    01/03/2018 00:03    1             4.403    
    01/03/2018 00:04    0             4.401       True
    01/03/2018 00:05    0             4.402       True
    01/03/2018 00:06    0             4.411       True
    02/03/2018 00:00    0             0.409    
    02/03/2018 00:01    0             NULL        True
    02/03/2018 00:02    0             NULL        True
    02/03/2018 00:03    0             0.403    

  • frederico_fonseca - Monday, May 28, 2018 6:48 AM

    I don't get the "4" . As far as I get it it should be 3
    first date
    first 3 are true - this is a consecutive 3
    next one is false (calibration = 1)
    next 3 are true - this is a consecutive 3
    second date
    first 1 is false - turbidity < 2
    next 2 are true (turbidity is null) - this is a consecutive 2
    next one is false (turbidity < 2
    DateTime          Calibration    Turbidity   True/False
    01/03/2018 00:00    0             7.409       True
    01/03/2018 00:01    0             9.401       True
    01/03/2018 00:02    0             8.409       True
    01/03/2018 00:03    1             4.403    
    01/03/2018 00:04    0             4.401       True
    01/03/2018 00:05    0             4.402       True
    01/03/2018 00:06    0             4.411       True
    02/03/2018 00:00    0             0.409    
    02/03/2018 00:01    0             NULL        True
    02/03/2018 00:02    0             NULL        True
    02/03/2018 00:03    0             0.403    

    frederico_fonseca, you are very right sir. I have updated the original post to reflect this. Thanks for pointing out the mistake.

  • SGT_squeequal - Monday, May 28, 2018 3:29 AM

    what makes a condition = true? 
    looking at your query 6 records from the 01/03/2018 are true (1), and from 02/03/2018 2 are true (1)

    so your results would be

    2018-03-01 00:00:00.000,6
    2018-03-02 00:00:00.000,2

    SGT_squeequal, The condition(s) are as follows:

    if Turbidity > 2 AND calibration = 0 Then 1
    else if Turbidity == NULL then 1
    else 0
    endif

    My current query calculates the total number of times the query is true, when what I need is the MAXIMUM CONSECUTIVE COUNT, i.e. maximum occurrence of the condition being true consecutively.

    Hope this makes sense.

    Juan

  • Well, here's something that works, but I'm not sure it's performance is going to be good as row count scales upward:
    CREATE TABLE #RawData (
        rowNo int IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED,
        DateTimeNow datetime,
        Calibration int,
        Turbidity REAL
    );
    INSERT INTO #RawData (DateTimeNow, Calibration, Turbidity)
        VALUES    ('01/03/2018 00:00:00', 0, 7.409),
                ('01/03/2018 00:01:00', 0, 9.401),
                ('01/03/2018 00:02:00', 0, 8.409),
                ('01/03/2018 00:03:00', 1, 4.403),
                ('01/03/2018 00:04:00', 0, 4.401),
                ('01/03/2018 00:05:00', 0, 4.402),
                ('01/03/2018 00:06:00', 0, 4.411),
                ('01/03/2018 00:07:00', 0, 4.477),
                ('02/03/2018 00:00:00', 0, 0.409),
                ('02/03/2018 00:01:00', 0, NULL),
                ('02/03/2018 00:02:00', 0, NULL),
                ('02/03/2018 00:03:00', 0, 0.403);

    WITH ALL_DATA AS (

        SELECT RD.rowNo, RD.DateTimeNow, RD.Calibration, RD.Turbidity,
            CONVERT(date, RD.DateTimeNow) AS ObservationDate,
            CASE
                WHEN RD.Turbidity > 2 AND RD.Calibration <> 1 THEN 1
                WHEN Turbidity IS NULL THEN 1
                ELSE 0
            END AS MeetsCondition
        FROM #RawData AS RD
    ),
        GROUPED_DATA AS (

            SELECT AD.ObservationDate,
                COUNT(GD.rowNo) AS ConsecutiveGroupNumber
            FROM ALL_DATA AS AD
                LEFT OUTER JOIN (
                    SELECT *,
                        AD.rowNo - ROW_NUMBER() OVER(PARTITION BY AD.ObservationDate ORDER BY AD.rowNo) AS IsConsecutive
                    FROM ALL_DATA AS AD
                    WHERE AD.MeetsCondition = 1
                    ) AS GD
                    ON AD.rowNo = GD.rowNo
            GROUP BY AD.ObservationDate, GD.IsConsecutive
    )
    SELECT GD.ObservationDate,
        MAX(GD.ConsecutiveGroupNumber) AS ConsecutiveDaysCount
    FROM GROUPED_DATA AS GD
    GROUP BY GD.ObservationDate
    ORDER BY GD.ObservationDate;

    DROP TABLE #RawData;

    I'll look forward to someone seeing a shortcut here and posting a better performing query.

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • sgmunson - Tuesday, May 29, 2018 8:06 AM

    Well, here's something that works, but I'm not sure it's performance is going to be good as row count scales upward:
    CREATE TABLE #RawData (
        rowNo int IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED,
        DateTimeNow datetime,
        Calibration int,
        Turbidity REAL
    );
    INSERT INTO #RawData (DateTimeNow, Calibration, Turbidity)
        VALUES    ('01/03/2018 00:00:00', 0, 7.409),
                ('01/03/2018 00:01:00', 0, 9.401),
                ('01/03/2018 00:02:00', 0, 8.409),
                ('01/03/2018 00:03:00', 1, 4.403),
                ('01/03/2018 00:04:00', 0, 4.401),
                ('01/03/2018 00:05:00', 0, 4.402),
                ('01/03/2018 00:06:00', 0, 4.411),
                ('01/03/2018 00:07:00', 0, 4.477),
                ('02/03/2018 00:00:00', 0, 0.409),
                ('02/03/2018 00:01:00', 0, NULL),
                ('02/03/2018 00:02:00', 0, NULL),
                ('02/03/2018 00:03:00', 0, 0.403);

    WITH ALL_DATA AS (

        SELECT RD.rowNo, RD.DateTimeNow, RD.Calibration, RD.Turbidity,
            CONVERT(date, RD.DateTimeNow) AS ObservationDate,
            CASE
                WHEN RD.Turbidity > 2 AND RD.Calibration <> 1 THEN 1
                WHEN Turbidity IS NULL THEN 1
                ELSE 0
            END AS MeetsCondition
        FROM #RawData AS RD
    ),
        GROUPED_DATA AS (

            SELECT AD.ObservationDate,
                COUNT(GD.rowNo) AS ConsecutiveGroupNumber
            FROM ALL_DATA AS AD
                LEFT OUTER JOIN (
                    SELECT *,
                        AD.rowNo - ROW_NUMBER() OVER(PARTITION BY AD.ObservationDate ORDER BY AD.rowNo) AS IsConsecutive
                    FROM ALL_DATA AS AD
                    WHERE AD.MeetsCondition = 1
                    ) AS GD
                    ON AD.rowNo = GD.rowNo
            GROUP BY AD.ObservationDate, GD.IsConsecutive
    )
    SELECT GD.ObservationDate,
        MAX(GD.ConsecutiveGroupNumber) AS ConsecutiveDaysCount
    FROM GROUPED_DATA AS GD
    GROUP BY GD.ObservationDate
    ORDER BY GD.ObservationDate;

    DROP TABLE #RawData;

    I'll look forward to someone seeing a shortcut here and posting a better performing query.

    Steve, you are a legend! It worked perfectly! Thank you!...as for me...so much to learn in this space still 🙁

  • Another option using the temp table provided by Steve, because it was a fun distraction:


    SELECT final.MaxDate, final.Max FROM
    (
        SELECT TOP 1 WITH TIES
            DateTimeNow AS MaxDate,
            count(gr.GroupNumber) AS Max
        FROM
        (
            SELECT
                CONVERT(date, RD.DateTimeNow) AS DateTimeNow
                ,(
                    rd.rowNo -
                    ROW_NUMBER() over (partition by
                        CASE
                            WHEN Turbidity > 2 AND Calibration <> 1 THEN 1
                            WHEN Turbidity IS NULL THEN 1
                            ELSE 0
                        END, CONVERT(date, RD.DateTimeNow)
                        ORDER by CONVERT(date, RD.DateTimeNow), rowNo)
                    ) as GroupNumber
            FROM #RawData rd
        ) gr
        GROUP BY gr.DateTimeNow, gr.GroupNumber
        ORDER BY ROW_NUMBER() OVER (PARTITION BY gr.DateTimeNow ORDER BY count(gr.GroupNumber) desc)
    ) final
    ORDER BY final.MaxDate;

  • juanleroux - Tuesday, May 29, 2018 2:47 PM

    sgmunson - Tuesday, May 29, 2018 8:06 AM

    Well, here's something that works, but I'm not sure it's performance is going to be good as row count scales upward:
    CREATE TABLE #RawData (
        rowNo int IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED,
        DateTimeNow datetime,
        Calibration int,
        Turbidity REAL
    );
    INSERT INTO #RawData (DateTimeNow, Calibration, Turbidity)
        VALUES    ('01/03/2018 00:00:00', 0, 7.409),
                ('01/03/2018 00:01:00', 0, 9.401),
                ('01/03/2018 00:02:00', 0, 8.409),
                ('01/03/2018 00:03:00', 1, 4.403),
                ('01/03/2018 00:04:00', 0, 4.401),
                ('01/03/2018 00:05:00', 0, 4.402),
                ('01/03/2018 00:06:00', 0, 4.411),
                ('01/03/2018 00:07:00', 0, 4.477),
                ('02/03/2018 00:00:00', 0, 0.409),
                ('02/03/2018 00:01:00', 0, NULL),
                ('02/03/2018 00:02:00', 0, NULL),
                ('02/03/2018 00:03:00', 0, 0.403);

    WITH ALL_DATA AS (

        SELECT RD.rowNo, RD.DateTimeNow, RD.Calibration, RD.Turbidity,
            CONVERT(date, RD.DateTimeNow) AS ObservationDate,
            CASE
                WHEN RD.Turbidity > 2 AND RD.Calibration <> 1 THEN 1
                WHEN Turbidity IS NULL THEN 1
                ELSE 0
            END AS MeetsCondition
        FROM #RawData AS RD
    ),
        GROUPED_DATA AS (

            SELECT AD.ObservationDate,
                COUNT(GD.rowNo) AS ConsecutiveGroupNumber
            FROM ALL_DATA AS AD
                LEFT OUTER JOIN (
                    SELECT *,
                        AD.rowNo - ROW_NUMBER() OVER(PARTITION BY AD.ObservationDate ORDER BY AD.rowNo) AS IsConsecutive
                    FROM ALL_DATA AS AD
                    WHERE AD.MeetsCondition = 1
                    ) AS GD
                    ON AD.rowNo = GD.rowNo
            GROUP BY AD.ObservationDate, GD.IsConsecutive
    )
    SELECT GD.ObservationDate,
        MAX(GD.ConsecutiveGroupNumber) AS ConsecutiveDaysCount
    FROM GROUPED_DATA AS GD
    GROUP BY GD.ObservationDate
    ORDER BY GD.ObservationDate;

    DROP TABLE #RawData;

    I'll look forward to someone seeing a shortcut here and posting a better performing query.

    Steve, you are a legend! It worked perfectly! Thank you!...as for me...so much to learn in this space still 🙁

    You're most welcome.   Keep learning, and don't be afraid to experiment with new ways to write queries.   If you stumble, post a question and learn something new about what the problem was and most importantly, WHY it was trouble.   Those kinds of questions are some of the fastest ways to learn.

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply