Group Numbering

  • I'm having brain fade today, I'm trying to give Groups of Data a number rather than the Rows within the Group.

    I have a table:

    CREATE TABLE dbo.EventRecord
    (
    EventRecordID INT IDENTITY(1,1) PRIMARY KEY,
    MachineID INT,
    SequenceNumber INT,
    RecordedUTC DATETIME,
    EventTypeID INT
    );

    For each MachineID the Sequence Number should increment by for each Event recorded.

    However if the Machine has encountered an error the Sequence Number will not be sequential, it will either have missed a few out or reset to 0.

    I need to report on the periods of running without error, so where the SequenceNumber int he next row isn't 1 greater than the current SequenceNumber it should form a group.

    I've been trying to give each group a unique number and subsequently I can get the MIN and MAX RecordedUTC for each group.

    So far I can establish if the next row is following the sequence:

    SELECT *,
    CASE LAG(SequenceNumber,1,-1) OVER(ORDER BY RecordedUTC asc)
    WHEN sequencenumber - 1 THEN 1
    ELSE 0 END AS SequentialEvents
    FROM dbo.EventRecord
    WHERE MachineID = 1

    Therefore I need to start a new Group Number every time SequentialEvents = 0

    So on the following screen shot I need Rows 1 to 7 to be Group1, Rows 8 to 13 to be Group2, 14 onwards to be Group3

    Forum

    I'm positive I've done this a thousand times before, but as I say, I'm suffering from brain fade so any help is appreciated.

    Regards

    Giles

  • This is a classic (Gaps and) Islands problem.  The following will not give you sequential group numbers, but it will give you unique group numbers.

    GroupNum = SequenceNumber - ROW_NUMBER() OVER (PARTITION BY MachineID ORDER BY EventRecordID)

    • This reply was modified 4 years, 10 months ago by  DesNorton.
    • This reply was modified 4 years, 10 months ago by  DesNorton.
    • This reply was modified 4 years, 10 months ago by  DesNorton.
  • Perfect, suits my needs perfectly thanks.

    Just to expand for anyone reading this in future:

    Whilst in my example the result of ROW_NUMBER can not be greater than the SequenceNumber if the machine reset its SequenceNumber we would start again at 1.

    Your formula still works giving a unique Group Number but it will be negative. But as I say it's fine for me.

  • giles.clapham wrote:

    Perfect, suits my needs perfectly thanks. Just to expand for anyone reading this in future: Whilst in my example the result of ROW_NUMBER can not be greater than the SequenceNumber if the machine reset its SequenceNumber we would start again at 1. Your formula still works giving a unique Group Number but it will be negative. But as I say it's fine for me.

     

    Correct.  I created the following sample data which reset to 0.  This resulted in a negative group number, but there was no requirement for a positive group number.

    CREATE TABLE #EventRecord (
    EventRecordID int /*IDENTITY(1, 1)*/ PRIMARY KEY
    , MachineID int
    , SequenceNumber int
    , RecordedUTC datetime
    , EventTypeID int
    );

    INSERT INTO #EventRecord ( EventRecordID, MachineID, SequenceNumber, RecordedUTC, EventTypeID )
    VALUES ( 1, 1, 1, '2019-06-14 08:59:42.370', 49 )
    , ( 2, 1, 2, '2019-06-14 09:00:42.370', 120 )
    , ( 3, 1, 3, '2019-06-14 09:01:42.370', 157 )
    , ( 4, 1, 4, '2019-06-14 09:02:42.370', 753 )
    , ( 5, 1, 5, '2019-06-14 09:03:42.370', 286 )
    , ( 6, 1, 6, '2019-06-14 09:04:42.370', 69 )
    , ( 7, 1, 7, '2019-06-14 09:05:42.370', 958 )
    , ( 11, 1, 11, '2019-06-14 09:09:42.370', 511 )
    , ( 12, 1, 12, '2019-06-14 09:10:42.370', 235 )
    , ( 13, 1, 13, '2019-06-14 09:11:42.370', 932 )
    , ( 14, 1, 14, '2019-06-14 09:12:42.370', 249 )
    , ( 15, 1, 15, '2019-06-14 09:13:42.370', 930 )
    , ( 16, 1, 0, '2019-06-14 09:14:42.370', 304 )
    , ( 18, 1, 1, '2019-06-14 09:16:42.370', 714 )
    , ( 19, 1, 2, '2019-06-14 09:17:42.370', 569 )
    , ( 20, 1, 3, '2019-06-14 09:18:42.370', 243 )
    , ( 21, 1, 4, '2019-06-14 09:19:42.370', 683 )
    , ( 22, 1, 5, '2019-06-14 09:20:42.370', 699 )
    , ( 23, 1, 6, '2019-06-14 09:21:42.370', 558 );
  • To avoid the "0" and negative problem, change it from using the sequence number to using the event record ID.  In the example you have above, it looks like that should solve the problem.

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • Mr. Brian Gale wrote:

    To avoid the "0" and negative problem, change it from using the sequence number to using the event record ID.  In the example you have above, it looks like that should solve the problem.

    Using the event record ID instead of the sequence number won't give the same results.  Specifically, the record with event record ID 16 will be in different groups depending on which field you use.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • That is true.  I suppose then it all depends on what you define as a "sequence".  If having the "Sequence ID" reset itself is a common occurrence, AND if resetting the sequence ID results in a new group, I imagine the fix for that using the sequence ID would be a bit more complex and likely involve using a CTE or 2.

    In the initial example, it seemed like the sequence ID and the event record ID corresponded to each other.  But looking at the table definition, it looks like in most cases, that would not be the case.  So my suggestion has a few assumptions (ie resetting the sequence ID does not result in a new group) and would only work with the sample data provided, not with the table definition.

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply