Finding missing numbers

  • I have a large table of order numbers that should be consecutive. What I'd like to do is find a way to generate the missing numbers between the values that I've already got. I've got the code below to find the gaps but I'm looking for a way to fill them in. I think I'll have to use something related to the row count but I'm well aware of the perils of RBAR and loops. This is just a learning exercise that I'm doing off my own bat so I'm open to any suggestions.

    create table #Orders (OrderID int)

    insert into #Orders values

    (111101),

    (111102),

    (111103),

    (111105),

    (111106),

    (111110),

    (111111),

    (111112),

    (111113),

    (111119),

    (111125)

    ;

    with cte as(

    select

    row1 = ROW_NUMBER() over(order by sl.OrderID desc)

    ,sl.OrderID

    from #Orders sl

    group by sl.OrderID

    )

    select

    Row = ROW_NUMBER() over(order by o1.row1)

    ,diff = o2.OrderID - o1.OrderID

    ,o1.OrderID

    into #missing

    from cte o1

    join cte o2 on o1.row1 = (o2.row1+1)

    where

    (o2.OrderID - o1.OrderID) > 1

    select * from #missing

    drop table #missing,#Orders


    On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.
    —Charles Babbage, Passages from the Life of a Philosopher

    How to post a question to get the most help http://www.sqlservercentral.com/articles/Best+Practices/61537

  • I would use a tally/numbers table and then a left join.

    The numbers table will change how you look at data.

    http://www.sqlservercentral.com/articles/62867/[/url]

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Sean Lange (12/3/2013)


    I would use a tally/numbers table and then a left join.

    The numbers table will change how you look at data.

    http://www.sqlservercentral.com/articles/62867/[/url]

    I was looking at tally tables and I'd found a way that seemed to work but took 25 minutes to run! I used the method on the link, and a left join (d'oh!), and it ran in three seconds.

    Thanks for your help Sean. I've now got a much better idea about what tally tables are useful for.


    On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.
    —Charles Babbage, Passages from the Life of a Philosopher

    How to post a question to get the most help http://www.sqlservercentral.com/articles/Best+Practices/61537

  • You can also use a fuction that simulates an auxiliary table of numbers.

    Virtual Auxiliary Table of Numbers

    http://sqlmag.com/sql-server/virtual-auxiliary-table-numbers

    SET NOCOUNT ON;

    USE tempdb;

    GO

    -- Itzik's VATN

    CREATE FUNCTION dbo.GetNums(@n AS BIGINT) RETURNS TABLE

    AS

    RETURN

    WITH

    L0 AS(SELECT 1 AS c UNION ALL SELECT 1),

    L1 AS(SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B),

    L2 AS(SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B),

    L3 AS(SELECT 1 AS c FROM L2 AS A CROSS JOIN L2 AS B),

    L4 AS(SELECT 1 AS c FROM L3 AS A CROSS JOIN L3 AS B),

    L5 AS(SELECT 1 AS c FROM L4 AS A CROSS JOIN L4 AS B),

    Nums AS(SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS n FROM L5)

    SELECT TOP (@n) n FROM Nums ORDER BY n;

    GO

    create table #Orders (OrderID int PRIMARY KEY)

    insert into #Orders values

    (111101),

    (111102),

    (111103),

    (111105),

    (111106),

    (111110),

    (111111),

    (111112),

    (111113),

    (111119),

    (111125)

    ;

    WITH C1 AS (

    SELECT

    A.OrderID + 1 AS start_range,

    C.OrderID - 1 AS end_range

    FROM

    #Orders AS A

    CROSS APPLY

    (

    SELECT TOP (1)

    B.OrderID

    FROM

    #Orders AS B

    WHERE

    B.OrderID > A.OrderID

    ORDER BY

    B.OrderID

    ) AS C

    WHERE

    C.OrderID - A.OrderID > 1

    )

    SELECT

    C1.*,

    C1.start_range + (C2.n - 1) AS OrderID

    FROM

    C1

    CROSS APPLY

    dbo.GetNums((C1.end_range - C1.start_range) + 1) AS C2

    ORDER BY

    C1.start_range,

    OrderID;

    GO

    DROP TABLE #Orders;

    GO

    IF OBJECT_ID('dbo.GetNums') IS NOT NULL DROP FUNCTION dbo.GetNums;

    GO

  • Here is what I came up with. Similar to Hunchback's version but didn't use a function to get the list of values.

    with E1(N) AS (select 1 from (values (1),(1),(1),(1),(1),(1),(1),(1),(1),(1))dt(n)),

    E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows

    E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max

    cteTally(N) AS

    (

    SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4

    )

    ,MinMax as

    (

    select MIN(OrderID) as MinID, MAX(OrderID) as MaxID

    from #Orders

    )

    select *

    from cteTally t

    left join #Orders o on o.OrderID - 111000 = t.N

    cross join MinMax

    where t.N > MinID - 111000 and t.N < MaxID - 111000

    and o.OrderID is null

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Thanks Hunchback, that was actually the very method I'd used that took 25 minutes. I'd been looking around Tally Tables and my searches on here lead me to that method. The auxiliary table itself took seconds to create but querying it for the missing values took a dog's age for some reason.

    As an aside, I'm slightly surprised by how many numbers are missing from the table. I think I might have to do bit of re-building.


    On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.
    —Charles Babbage, Passages from the Life of a Philosopher

    How to post a question to get the most help http://www.sqlservercentral.com/articles/Best+Practices/61537

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply