Trouble grouping a data series when a group sometimes crosses a date boundary and when the same group appears again later in the sequence.

  • I have a set of manufacturing data snapshots. The snapshots include the serial number of the product, and I need a listing of that product as it appears in the snapshots, in sequential order. The catch is that a run of a serial number can cross over to the next day if the product is run late enough, and the same serial number can appear again later if the product is rerun after other serial numbers.

    The problem is that each run of the product has to be represented separately, but every windowing function we have tried assigns the same value to every run of a given serial number. If a date component is added to any of the ORDER BYs, then the run that crosses into the next date gets split onto a new value. Note that the index_num column represents row numbers within a single file; when the numbers start over, a new file was read.

    So what I am looking for is a dataset that looks kind of like this:
    my_group    index_num    serial_number    trans_date    trans_time
    1           1            A07060           2018-01-01    21:15:00.0000000
    1           2            A07060           2018-01-01    21:30:00.0000000
    1           3            A07060           2018-01-01    21:45:00.0000000
    1           4            A07060           2018-01-01    22:00:00.0000000
    2           5            A07059           2018-01-01    22:15:00.0000000
    2           6            A07059           2018-01-01    22:30:00.0000000
    2           7            A07059           2018-01-01    22:45:00.0000000
    2           8            A07059           2018-01-01    23:00:00.0000000
    3           9            A07060           2018-01-01    23:15:00.0000000
    3           10           A07060           2018-01-01    23:30:00.0000000
    3           11           A07060           2018-01-01    23:45:00.0000000
    3           1            A07060           2018-01-02    00:00:00.0000000
    4           2            A07061           2018-01-02    00:30:00.0000000
    4           3            A07061           2018-01-02    00:45:00.0000000
    4           4            A07061           2018-01-02    01:00:00.0000000
    4           5            A07061           2018-01-02    01:15:00.0000000
    5           6            A07060           2018-01-02    01:30:00.0000000
    5           7            A07060           2018-01-02    01:45:00.0000000
    5           8            A07060           2018-01-02    02:00:00.0000000
    5           9            A07060           2018-01-02    02:15:00.0000000

    Below is a sample dataset and a query showing the grouping attempts we have already tried:

    CREATE TABLE #serial_numbers (index_num int, serial_number nvarchar(30), trans_date date, trans_time time)

    INSERT INTO #serial_numbers VALUES (1, 'A07060', '1/1/2018', '9:15 PM')
    INSERT INTO #serial_numbers VALUES (2, 'A07060', '1/1/2018', '9:30 PM')
    INSERT INTO #serial_numbers VALUES (3, 'A07060', '1/1/2018', '9:45 PM')
    INSERT INTO #serial_numbers VALUES (4, 'A07060', '1/1/2018', '10:00 PM')
    INSERT INTO #serial_numbers VALUES (5, 'A07059', '1/1/2018', '10:15 PM')
    INSERT INTO #serial_numbers VALUES (6, 'A07059', '1/1/2018', '10:30 PM')
    INSERT INTO #serial_numbers VALUES (7, 'A07059', '1/1/2018', '10:45 PM')
    INSERT INTO #serial_numbers VALUES (8, 'A07059', '1/1/2018', '11:00 PM')
    INSERT INTO #serial_numbers VALUES (9, 'A07060', '1/1/2018', '11:15 PM')
    INSERT INTO #serial_numbers VALUES (10, 'A07060', '1/1/2018', '11:30 PM')
    INSERT INTO #serial_numbers VALUES (11, 'A07060', '1/1/2018', '11:45 PM')
    INSERT INTO #serial_numbers VALUES (1, 'A07060', '1/2/2018', '12:00 AM')
    INSERT INTO #serial_numbers VALUES (2, 'A07061', '1/2/2018', '12:30 AM')
    INSERT INTO #serial_numbers VALUES (3, 'A07061', '1/2/2018', '12:45 AM')
    INSERT INTO #serial_numbers VALUES (4, 'A07061', '1/2/2018', '1:00 AM')
    INSERT INTO #serial_numbers VALUES (5, 'A07061', '1/2/2018', '1:15 AM')
    INSERT INTO #serial_numbers VALUES (6, 'A07060', '1/2/2018', '1:30 AM')
    INSERT INTO #serial_numbers VALUES (7, 'A07060', '1/2/2018', '1:45 AM')
    INSERT INTO #serial_numbers VALUES (8, 'A07060', '1/2/2018', '2:00 AM')
    INSERT INTO #serial_numbers VALUES (9, 'A07060', '1/2/2018', '2:15 AM')

    SELECT
      ROW_NUMBER() OVER (ORDER BY trans_date, trans_time) AS rn,
      index_num - ROW_NUMBER() OVER (ORDER BY trans_date, trans_time) AS grp,
      DENSE_RANK() OVER (ORDER BY serial_number) AS dr,
      DENSE_RANK() OVER (ORDER BY serial_number, trans_date) AS drd,
      index_num - ROW_NUMBER() OVER (ORDER BY trans_date, trans_time) - DENSE_RANK() OVER (ORDER BY serial_number) AS my_group,
      *
    FROM #serial_numbers
    ORDER BY
      trans_date,
      trans_time

    DROP TABLE #serial_numbers

    I can't seem to get the results to copy over here in a readable format, but running the query above will show what we have already tried and what it returns.

    Thanks for your assistance.

  • WITH CTE AS (
        SELECT
            -- Flag the first row of each consecutive run: compare each row's
            -- serial_number to the previous row's (ordered by date and time).
            CASE WHEN LAG(serial_number, 1) OVER( ORDER BY trans_date, trans_time) = serial_number THEN 0 ELSE 1 END AS is_start,
            index_num,
            serial_number,
            trans_date,
            trans_time
        FROM #serial_numbers
    )
    SELECT
        -- A running total of the start flags numbers the runs sequentially.
        SUM(is_start) OVER(ORDER BY trans_date, trans_time ROWS UNBOUNDED PRECEDING ) AS my_group_index,
        index_num,
        serial_number,
        trans_date,
        trans_time
    FROM CTE
    ORDER BY trans_date, trans_time;
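
    The LAG comparison flags the first row of each run of a serial number (the very first row is flagged too, because LAG returns NULL there), and the running SUM of those flags numbers the runs in order, so a run that crosses midnight stays in one group while a serial number that comes back later starts a new group.

    If you want to sanity-check the grouping against your expected output, something like this (untested sketch, just wrapping the query above in another CTE) collapses each run to one row; you should see five groups of four rows each:

    WITH CTE AS (
        SELECT
            CASE WHEN LAG(serial_number, 1) OVER(ORDER BY trans_date, trans_time) = serial_number THEN 0 ELSE 1 END AS is_start,
            index_num, serial_number, trans_date, trans_time
        FROM #serial_numbers
    ), numbered AS (
        SELECT
            SUM(is_start) OVER(ORDER BY trans_date, trans_time ROWS UNBOUNDED PRECEDING) AS my_group_index,
            serial_number, trans_date, trans_time
        FROM CTE
    )
    SELECT
        my_group_index,
        serial_number,
        COUNT(*) AS rows_in_group
    FROM numbered
    GROUP BY my_group_index, serial_number
    ORDER BY my_group_index;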

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • Perfect! Thank you so much!
