Make Sequential numbers into ranges

  • Also, the 2nd version can be modified slightly to this (basically the same, but different):

    CREATE TABLE #Temp

    (TNID INT, TN BIGINT);

    INSERT INTO #Temp

    SELECT 3011170, 4402787100

    UNION ALL SELECT 3011170, 4402787101

    UNION ALL SELECT 3011170, 4402787102

    UNION ALL SELECT 3011170, 4402787103

    UNION ALL SELECT 3011171, 4402787104

    UNION ALL SELECT 3011171, 4402787105

    UNION ALL SELECT 3011175, 4402787118

    UNION ALL SELECT 3011175, 4402787119

    UNION ALL SELECT 3011175, 4405165555

    UNION ALL SELECT 3011175, 4405165556

    UNION ALL SELECT 3011175, 4405165557

    UNION ALL SELECT 3011175, 4405165558

    UNION ALL SELECT 3011175, 4409447976;

    with basedata as (

    select

    TNID,

    TN,

    TN - row_number() over (partition by TNID order by TN) as grp

    from

    #Temp

    )

    select

    TNID,

    min(TN) as TN1,

    max(TN) as TN2

    from

    basedata

    group by

    grp, TNID

    order by

    TNID, TN1

    ;

    go

    drop table #Temp;

    go

  • Thanks Lynn:exclamationmark: Kudos.

  • Next time you should post on the correct forum. 😉

    You could have avoided that Lynn hits his head with the desk.

    Seriously, you can get better answers if you do so.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • I did a big-table test on a couple of these.

    Test-harness:

    CREATE TABLE dbo.SequenceSkips

    (Num1 INT NOT NULL,

    Num2 INT NOT NULL,

    CONSTRAINT PK_SequenceSkips PRIMARY KEY (Num1, Num2));

    GO

    INSERT INTO dbo.SequenceSkips

    (Num1,

    Num2)

    SELECT N1.Number,

    N2.Number

    FROM Common.dbo.Numbers AS N1

    CROSS JOIN Common.dbo.Numbers AS N2

    WHERE N1.Number BETWEEN 1 AND 1000

    AND N2.Number BETWEEN 1 AND 1000;

    GO

    MERGE INTO dbo.SequenceSkips AS Tgt

    USING

    (SELECT TOP (10000)

    *

    FROM dbo.SequenceSkips

    ORDER BY NEWID()) AS Src

    ON Tgt.Num1 = Src.Num1

    AND Tgt.Num2 = Src.Num2

    WHEN MATCHED

    THEN DELETE;

    Creates a million-row table paralleling the structure of this question. Then deletes 10k rows (1%) in order to create skips in the sequence, using NewID to make the gaps effectively random.

    Here's a cursor-based solution, as per what I proposed:

    IF OBJECT_ID(N'tempdb..#T') IS NOT NULL

    DROP TABLE #T;

    SET NOCOUNT ON;

    DECLARE Cur CURSOR LOCAL FAST_FORWARD

    FOR

    SELECT Num1,

    Num2

    FROM dbo.SequenceSkips

    ORDER BY Num1,

    Num2;

    OPEN Cur;

    DECLARE @N1 INT,

    @N2 INT,

    @LastN1 INT,

    @LastN2 INT;

    FETCH NEXT FROM Cur INTO @N1, @N2;

    CREATE TABLE #T

    (N1 INT,

    RangeStart INT,

    RangeEnd INT);

    SELECT @LastN1 = @N1,

    @LastN2 = @N2;

    INSERT INTO #T

    (N1, RangeStart, RangeEnd)

    VALUES (@N1, @N2, NULL);

    WHILE @@FETCH_STATUS = 0

    BEGIN

    FETCH NEXT FROM Cur INTO @N1, @N2;

    IF @N1 > @LastN1

    OR @N2 > @LastN2 + 1

    BEGIN;

    UPDATE #T

    SET RangeEnd = @LastN2

    WHERE RangeEnd IS NULL;

    INSERT INTO #T

    (N1, RangeStart, RangeEnd)

    VALUES (@N1, @N2, NULL);

    END;

    SELECT @LastN1 = @N1,

    @LastN2 = @N2;

    END;

    CLOSE Cur;

    DEALLOCATE Cur;

    SELECT *

    FROM #T;

    On my desktop machine, which is pretty heavy-duty (quad-core, 16 Gig of RAM, etc.), this took an average of 13 seconds per test run. That includes returning the final data, which came out to 10,879 rows in my test (may be different in other tests, because of the nature of the test-harness).

    A recursive CTE version, based on Lynn's first concept, but with MaxRecursion 0 set:

    WITH rCTE

    AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) rn,

    t1.Num2,

    t1.Num1

    FROM dbo.SequenceSkips t1

    LEFT OUTER JOIN dbo.SequenceSkips t2

    ON t1.Num1 = t2.Num1

    AND t1.Num2 = t2.Num2 + 1

    WHERE t2.Num1 IS NULL

    UNION ALL

    SELECT rn,

    t1.Num2,

    t1.Num1

    FROM dbo.SequenceSkips t1

    INNER JOIN rCTE t2

    ON t1.Num2 = t2.Num2 + 1

    AND t1.Num1 = t2.Num1)

    SELECT Num1,

    MIN(Num2) TN1,

    MAX(Num2) TN2

    FROM rCTE

    GROUP BY rn,

    Num1

    ORDER BY rn,

    Num1

    OPTION (MAXRECURSION 0);

    Took an average of 16 seconds over multiple runs.

    Lynn's second query:

    WITH basedata

    AS (SELECT Num1,

    Num2,

    Num2 - ROW_NUMBER() OVER (PARTITION BY Num1 ORDER BY Num2) AS grp

    FROM dbo.SequenceSkips)

    SELECT Num1,

    MIN(Num2) AS TN1,

    MAX(Num2) AS TN2

    FROM basedata

    GROUP BY grp,

    Num1

    ORDER BY Num1, TN1;

    Was under one second.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • I was investigating the thought process on the 2nd query (using row_num) and I was impressed how Lynn came up with the grp field "TN - row_number() over (partition by TNID order by TN) as grp". Nice thinking out of the box :w00t:

    Here's the raw data before the MIN() and MAX().

    TNIDTN grp rowNUM

    3011170440278710044027870991

    3011170440278710144027870992

    3011170440278710244027870993

    3011170440278710344027870994

    3011171440278710444027871031

    3011171440278710544027871032

    3011175440278711844027871171

    3011175440278711944027871172

    3011175440516555544051655523

    3011175440516555644051655524

    3011175440516555744051655525

    3011175440516555844051655526

    3011175440944797644094479697

  • fsr645 (8/28/2012)


    I was investigating the thought process on the 2nd query (using row_num) and I was impressed how Lynn came up with the grp field "TN - row_number() over (partition by TNID order by TN) as grp". Nice thinking out of the box :w00t:

    Here's the raw data before the MIN() and MAX().

    TNIDTN grp rowNUM

    3011170440278710044027870991

    3011170440278710144027870992

    3011170440278710244027870993

    3011170440278710344027870994

    3011171440278710444027871031

    3011171440278710544027871032

    3011175440278711844027871171

    3011175440278711944027871172

    3011175440516555544051655523

    3011175440516555644051655524

    3011175440516555744051655525

    3011175440516555844051655526

    3011175440944797644094479697

    Had some help from this article: http://www.sqlservercentral.com/articles/T-SQL/71550/.

Viewing 6 posts - 16 through 20 (of 20 total)

You must be logged in to reply to this topic. Login to reply