How to find missing number ??

  • create table QA_Test (ID int,BID varchar(5),NumBldgs int)

    Insert into QA_Test

    Select 123,'001',4

    Union all

    Select 123,'002',4

    Union all

    Select 123,'004',4

    Union all

    Select 123,'005',4

    Union all

    Select 124,'001',3

    Union all

    Select 124,'002',3

    Union all

    Select 124,'003',3

    Union all

    Select 125,'001',3

    Union all

    Select 125,'003',3

    Union all

    Select 125,'004',3

    In the above table I need to find the missing BID with respect to ID.If the number of NumBldgs id not equal to the count of BID of that ID. Please help.

    Thanks

  • well from your description and the data added then there are no missing BID

    id 123 has count of 4 and there are 4 rows,

    id 124 has a count of 3 and there are 3 rows

    id 125 has a count of 3 and there are 3 rows

    so what are you looking for what would be your expected results?

    ***The first step is always the hardest *******

  • Thanks for the catch :hehe:

    Actually

    for 123 I need to find the missing sequence BID '003'

    for 125 I need to find the missing sequence BID '002'

    please ignore numbldgs column.

    Thanks

  • I think someone wrote an article about this problem. Search this site for "gaps and islands" and you should find it.

    John

  • thsi will help you identify where the gaps are

    ;with seqcheck

    as

    (

    select row_number() over(partition by id order by id ) as rn, id,bid from qa_test

    )

    select b.bid,a.bid

    from seqcheck a inner join seqcheck b on a.rn=b.rn+1

    and a.id=b.id

    where a.bid!=b.bid+1

    ***The first step is always the hardest *******

  • The data type char(6) is difficult to tell what is missing in this case.

    What is the logic that determines '003' is missing from 123?

    Let's say we have '010', '030', what is/are missing item(s)? '011','012'....'029' or just '020'?

    First step would be really decide what your gap logic is.

    Let's say this field can be treated as numbers and you're looking for continuous numbers for same ID. You could either use the query above or this one:

    select

    t1.id,

    t1.bid + 1 as start_missing_bid,

    MIN(t2.bid) -1 as end_missing_bid

    from

    qa_test t1

    inner join qa_test t2 on t2.id = t1.id and t2.bid >t1.bid

    group by

    t1.id,

    t1.bid

    having

    t1.bid + 1 < MIN(t2.bid)

    order by

    t1.id

    If this BID field cannot be treated as numbers, you might have to define your own function to determine whether 2 BID values are continuous.

    Also, Using an auxiliary table could be more efficient in identifying gaps if you have a big table.

    ----------------------------------------------------------------------------------------------
    Microsoft Certified Solution Master: Data Platform, Microsoft Certified Trainer
    Email: Louis.Li@rrlminc.com | Blog[/url] | LinkedIn[/url]

  • I need the result like this..all missing number between '001' to max(bid)

    IDBID

    1232

    1253

    1256

    1283

    If we have '010', '030', Missing items will be '011','012'....'029' .

  • CELKO (8/9/2012)


    You also need to start using the ANSI Standard row constructor syntax instead of the old Sybase 1970's stuff you had:

    INSERT INTO QA_Tests

    VALUES (123, '001', 4),

    (123, '002', 4),

    (123, '004', 4),

    (123, '005', 4),

    (124, '001', 3),

    (124, '002', 3),

    (124, '003', 3),

    (125, '001', 3),

    (125, '003', 3),

    (125, '004', 3);

    The above works fine in SQL Server 2008, but if someone only has access to SQL Server 2005 (or earlier) it doesn't.

    Yes, I know that this is a SQL Server 2008 forum, but I have been called on this before as well.

  • Here is a nice little writeup on detecting gaps and some methods that can be used

    http://blog.waynesheffield.com/wayne/archive/2012/04/sql-server-2012-performance-test-gap-detection/

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • nikki_d (8/9/2012)


    I need the result like this..all missing number between '001' to max(bid)

    IDBID

    1232

    1253

    1256

    1283

    If we have '010', '030', Missing items will be '011','012'....'029' .

    Looks like this BID is treated as number. If you are looking for each individual missing BID, I would create a supplemental table first:

    declare @num as int = 1;

    if OBJECT_ID('aux_numbers') is not null drop table aux_numbers

    create table aux_numbers

    (

    Number int

    )

    insert into aux_numbers values(1)

    While @num < 100000

    begin

    insert into aux_numbers

    select Number+@num from aux_numbers

    set @num = @num * 2

    end

    Then we can go ahead to generate the list we want:

    If max(bid) you mentioned is by ID, the query is simpler:

    with GapRange as

    (

    select

    t1.id,

    t1.bid + 1 as start_missing_bid,

    MIN(t2.bid) -1 as end_missing_bid

    from

    QA_Test t1

    inner join QA_Test t2 on (t2.id = t1.id and t2.bid >t1.bid)

    group by

    t1.id,

    t1.bid

    having

    t1.bid + 1 < MIN(t2.bid)

    )

    select g.id, n.number as missing_bid from aux_numbers n inner join GapRange g on n.number >= g.start_missing_bid and n.number <= g.end_missing_bid

    order by g.id

    And it yields:

    idmissing_bid

    1233

    1252

    1253

    However, it max(BID) means max BID for entire table,we will have to add extra logic. I decided to steal SGT_squeequal's query and make some modifications:

    with Gaps as

    (

    select

    ID,

    bid,

    ROW_NUMBER() over (partition by id order by ID, BID) as rn,

    q2.max_bid

    from

    qa_test q1 cross join (SELECT MAX(BID) AS max_bid FROM QA_Test) q2

    ),

    GapRange as

    (

    select

    g1.id,

    g1.bid + 1 as start_missing_bid,

    g2.bid - 1 as end_missing_bid

    from gaps g1 inner join gaps g2 on g1.rn +1 = g2.rn and g1.id = g2.id

    where g1.bid +1 <> g2.bid

    union all

    select

    g3.id,

    MAX(g3.bid) + 1 as start_missing_bid,

    g3.max_bid as end_missing_bid

    from

    gaps g3

    group by

    g3.id, g3.max_bid

    having

    MAX(g3.bid) + 1 <= g3.max_bid

    )

    select ID,number from gaprange inner join aux_numbers on Number >= start_missing_bid and Number <= end_missing_bid

    order by ID

    This query shows 2 more rows than last one:

    IDnumber

    1233

    1244

    1245

    1252

    1253

    The differece is in the second CTE which unions missing numbers from global max BID by ID.

    Hope this helps.

    ----------------------------------------------------------------------------------------------
    Microsoft Certified Solution Master: Data Platform, Microsoft Certified Trainer
    Email: Louis.Li@rrlminc.com | Blog[/url] | LinkedIn[/url]

  • A second though on second scanario: max BID of table, this query looks clean (still need aux_numbers table):

    with FullBID as

    (

    select ID,

    number

    from (select distinct ID from QA_Test) q cross join aux_numbers

    where number <= (SELECT MAX(BID) FROM QA_Test)

    )

    select f.ID,f.Number from FullBID f left join QA_Test q on f.ID = q.ID and CAST(q.BID as int) = f.Number

    where q.ID is null

    order by f.ID, f.number

    I suggest you try both and see which is better as query plan shows this one is less efficient than the other one.

    ----------------------------------------------------------------------------------------------
    Microsoft Certified Solution Master: Data Platform, Microsoft Certified Trainer
    Email: Louis.Li@rrlminc.com | Blog[/url] | LinkedIn[/url]

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply