build groups by finding specific "first and last records"

  • Hello,

    After receiving great help here, I am coming back for a little follow-up. I need to refine the previous question by adding an additional step which goes down to this:

    • find the first "startSignal" in an ordered table and match it with the first "stopSignal" to form a group
    • there can be multiple round-trips

    I've tried to play around with analytic functions and simple min/max joins, but I failed to assign the actual first stop AFTER a running start...

    create table signals
    (
    id int,
    startSignal int,
    stopSignal int
    )

    -- some noise to be ignored
    insert into signals values (1, 0, 1);
    insert into signals values (2, 0, 1);
    insert into signals values (3, 0, 1);
    insert into signals values (4, 0, 0);
    insert into signals values (5, 0, 0);

    -- this is an actual START (group 1)
    -- because it is seen before a closing STOP
    insert into signals values (6, 1, 0);

    -- these are not STARTs because there was no STOP yet
    insert into signals values (7, 1, 0);
    insert into signals values (8, 1, 0);
    insert into signals values (9, 1, 0);

    -- more noise
    -- (still within the valid/running START group)
    insert into signals values (10, 0, 0);
    insert into signals values (11, 0, 0);

    -- now's the relevant STOP
    -- I want to assign this to the first start (id 6)
    insert into signals values (12, 0, 1);

    -- even more noise
    insert into signals values (13, 0, 1);
    insert into signals values (14, 0, 1);
    insert into signals values (15, 0, 0);
    insert into signals values (16, 0, 0);

    -- there could be more "round-trips" (groups)
    -- with diffent noise, but the rules stay the same.
    -- it's always: first START and first STOP after this (running) start form a group

    -- (same pattern here for demo)

    -- some noise to be ignored
    insert into signals values (21, 0, 1);
    insert into signals values (22, 0, 1);
    insert into signals values (23, 0, 1);
    insert into signals values (24, 0, 0);
    insert into signals values (25, 0, 0);

    -- this is an actual START (group 1)
    -- because it is seen before a closing STOP
    insert into signals values (26, 1, 0);

    -- these are not STARTs because there was no STOP yet
    insert into signals values (27, 1, 0);
    insert into signals values (28, 1, 0);
    insert into signals values (29, 1, 0);

    -- more noise
    -- (still within the valid/running START group)
    insert into signals values (30, 0, 0);
    insert into signals values (31, 0, 0);

    -- now's the relevant STOP
    -- I want to assign this to the first start (id 26)
    insert into signals values (32, 0, 1);

    -- even more noise
    insert into signals values (33, 0, 1);
    insert into signals values (34, 0, 1);
    insert into signals values (35, 0, 0);
    insert into signals values (36, 0, 0);

    Given the demo table above, I want to find & join:

    • id 6 (start) joined with id 12 (first stop after the current start)
    • id 26 (start) joined with id 32 (first stop after the current start)

    Background: I am trying to analyze some hardware data from a HWInfi64 log 🙂

  • This is known as a "Data Islands" problem - you're looking for groups of records in an ordered set.

    You have a identity value on the table, so this gets easy. The trick is to first add a ROW_NUMBER() column to the query and subtract that value from the id value to get a grouping value. To get the first and last records for each set of "Start = 1" rows:

    --Select rows where Start = 1 
    SELECT id, StartSignal,StopSignal
    FROM Signals
    WHERE StartSignal = 1;

    id startSignal stopSignal
    ----------- ----------- -----------
    6 1 0
    7 1 0
    8 1 0
    9 1 0
    26 1 0
    27 1 0
    28 1 0
    29 1 0

    -- Add a ROW_NUMBER() OVER (ORDER BY id), and also subtract that value from the
    -- id value to get a grouping value:
    SELECT id, row_number() OVER (ORDER BY (id)) AS RowNum,
    id - (row_number() OVER (ORDER BY (id))) AS GroupVal,
    StartSignal,StopSignal
    FROM Signals
    WHERE StartSignal = 1

    id RowNum GroupVal StartSignal StopSignal
    ----------- -------------------- -------------------- ----------- -----------
    6 1 5 1 0
    7 2 5 1 0
    8 3 5 1 0
    9 4 5 1 0
    26 5 21 1 0
    27 6 21 1 0
    28 7 21 1 0
    29 8 21 1 0

    -- Group by GroupVal and pull MIN() and MAX() ids to get your islands:
    WITH StartOnes AS
    (
    SELECT id, id - (row_number() OVER (ORDER BY (id))) AS GroupVal, StartSignal, StopSignal
    FROM signals
    WHERE StartSignal = 1
    )
    SELECT GroupVal, min(id) AS StartOneBegin, max(id) AS StartOneEnd
    FROM StartOnes
    GROUP BY GroupVal
    ORDER BY 1;

    GroupVal StartOneBegin StartOneEnd
    -------------------- ------------- -----------
    5 6 9
    21 26 29

    -- Repeat these steps WHERE StopSignal = 1 and you have your groups.

    Eddie Wuerch
    MCM: SQL

Viewing 2 posts - 1 through 1 (of 1 total)

You must be logged in to reply to this topic. Login to reply