Problem with ranking

  • Hi,

    I have situation like this

    081100001

    081100002

    081100003

    .

    .

    .

    081100999

    in this col I have the values that is not in order and they are missing how can I identify the broken numbers

    081100001

    081100002

    081100003

    -081100004 is missing

    081100005

    081100006

    081100007

    any idea how to find the solution!

    thnx!

    :hehe:

    ============================================================
    SELECT YOUR PROBLEM FROM SSC.com WHERE PROBLEM DESCRIPTION =
    http://www.sqlservercentral.com/articles/Best+Practices/61537/[/url]

  • http://www.sqlservercentral.com/articles/SQL+Puzzles/findinggapsinasequentialnumbersequence/2336/

    http://www.sqlservercentral.com/articles/Datetime+Manipulation/61822/

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • Interesting ... but little bit confused couz for the moment I haven't any idea how to use it in my column ...my column is not datetime it is just a text created from two digits of year '08', two digits from the number of office '11' and the others are sequential 00001 ....

    then how to use it in text column not datetime!

    thnx anyway!

    ============================================================
    SELECT YOUR PROBLEM FROM SSC.com WHERE PROBLEM DESCRIPTION =
    http://www.sqlservercentral.com/articles/Best+Practices/61537/[/url]

  • Should I collect data from WITH clause or where to put my column for compering!

    ============================================================
    SELECT YOUR PROBLEM FROM SSC.com WHERE PROBLEM DESCRIPTION =
    http://www.sqlservercentral.com/articles/Best+Practices/61537/[/url]

  • Dugi (9/15/2008)


    Interesting ... but little bit confused couz for the moment I haven't any idea how to use it in my column ...my column is not datetime it is just a text created from two digits of year '08', two digits from the number of office '11' and the others are sequential 00001 ....

    then how to use it in text column not datetime!

    thnx anyway!

    Darn key algorithms :pinch:

    So you should build a potential cte from 00001 to 99999 using a tally table.

    www.sqlservercentral.com/articles/TSQL/62867/

    and then launch this:

    Select *

    from yourtable T1

    where not exists (Select *

    from tempdb.dbo.Tally T

    where T.N = convert(integer,substring(T1.keycol, 5,5))

    )

    order by T1.Keycol

    or use its left join equivalent

    Select T1.*

    from yourtable T1

    let join tempdb.dbo.Tally T

    ON T.N = convert(integer,substring(T1.keycol, 5,5))

    Where T.N is null

    order by T1.Keycol

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • I cannot understand how to find the solution here, I'm doing it also when I export the data in Excel and using the function for comparing!

    Is it possible to generate the N as ROW_NUMBER() or taking from Tally table, but if I have the broken number in my column the N will starts from 1 again, so if I retrieve the data like this, I'm sure that the second possibility is to select the data where N = 1 or N in (1,2) just to see the broken numbers!

    Thnx anyway!

    Dugi

    ============================================================
    SELECT YOUR PROBLEM FROM SSC.com WHERE PROBLEM DESCRIPTION =
    http://www.sqlservercentral.com/articles/Best+Practices/61537/[/url]

  • Oh boy, I messed up pretty well :blush:

    I'll get back on this if I find some time today ...

    For sure this way it will work, although it may take a while

    Declare @YourTable table ( TheMeaningfullKey char(9) not null primary key, tsCreate datetime not null default getdate())

    Declare @TheFullList table ( TheMeaningfullKey char(9) not null primary key , YYRR as substring(TheMeaningfullKey,1,4) )

    set nocount on

    insert into @YourTable ( TheMeaningfullKey ) values ('081100001')

    insert into @YourTable ( TheMeaningfullKey ) values ('081100002')

    insert into @YourTable ( TheMeaningfullKey ) values ('081100003')

    -- insert into @YourTable ( TheMeaningfullKey ) values ('081100004') is missing

    insert into @YourTable ( TheMeaningfullKey ) values ('081100005')

    insert into @YourTable ( TheMeaningfullKey ) values ('081200006') -- another RR

    insert into @YourTable ( TheMeaningfullKey ) values ('081100007')

    set nocount off

    Insert into @TheFullList

    Select distinct substring(T0.TheMeaningfullKey, 1,4) + RIGHT(T.N+1000000,5)

    from @YourTable T0

    , tempdb.dbo.Tally T

    where substring(T0.TheMeaningfullKey, 1,4) + RIGHT(T.N+1000000,5)

    < (Select max(T1.TheMeaningfullKey) max_TheMeaningfullKey

    from @YourTable T1

    Where substring(T1.TheMeaningfullKey, 1,4) = substring(T0.TheMeaningfullKey, 1,4) )

    Select F.*

    from @TheFullList F

    left join @YourTable T0

    on T0.TheMeaningfullKey = F.TheMeaningfullKey

    where T0.TheMeaningfullKey is null

    order by F.TheMeaningfullKey

    /*

    *

    * http://www.sqlservercentral.com/articles/TSQL/62867/#

    *

    * The "Numbers" or "Tally" Table: What it is and how it replaces a loop.

    * By Jeff Moden, 2008/05/07

    *

    */

    /*

    USE TempDB

    --DB that everyone has where we can cause no harm

    SET NOCOUNT ON

    --Supress the auto-display of rowcounts for appearance/speed

    DECLARE @StartTime DATETIME

    --Timer to measure total duration

    SET @StartTime = GETDATE()

    --Start the timer --=============================================================================

    -- Create and populate a Tally table

    --=============================================================================

    --===== Conditionally drop

    IF OBJECT_ID('dbo.Tally') IS NOT NULL

    DROP TABLE dbo.Tally

    --===== Create and populate the Tally table on the fly

    SELECT TOP 99999 --equates to more than 30 years of dates

    IDENTITY( INT,1,1 ) AS N

    INTO dbo.Tally

    FROM Master.dbo.SysColumns sc1

    , Master.dbo.SysColumns sc2

    --===== Add a Primary Key to maximize performance

    ALTER TABLE dbo.Tally

    ADD CONSTRAINT PK_Tally_N PRIMARY KEY CLUSTERED ( N ) WITH FILLFACTOR = 100

    --===== Let the public use it

    GRANT SELECT, REFERENCES ON dbo.Tally TO PUBLIC

    --===== Display the total duration

    SELECT STR(DATEDIFF(ms, @StartTime, GETDATE())) + ' Milliseconds duration'

    */

    /* MY results

    (11 row(s) affected)

    TheMeaningfullKey YYRR

    ----------------- ----

    081100004 0811

    081100006 0811

    081200001 0812

    081200002 0812

    081200003 0812

    081200004 0812

    081200005 0812

    */

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • This one takes way to long !!!

    Select RIGHT(F.[Number]+1000000000,9) as TheMissingKey

    from ( Select substring(T1.TheMeaningfullKey, 1,4) as YYRR , max(T1.TheMeaningfullKey) as max_TheMeaningfullKey

    from @YourTable T1

    group by substring(T1.TheMeaningfullKey, 1,4) ) M

    cross apply dbo.ufn_GetNumbers(M.YYRR + '00000', M.max_TheMeaningfullKey) F -- http://www.sqlservercentral.com/articles/Datetime+Manipulation/61822/

    where not exists (Select * from @YourTable T2 where T2.TheMeaningfullKey = RIGHT(F.[Number]+1000000000,9))

    order by TheMissingKey

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • ALZDBA

    Thank you, you are brilliant with your codes here ... I confused and I'm just looking what happened ...I will try later in my table couz no way to find my solution!

    Many thnx!

    Dugi

    ============================================================
    SELECT YOUR PROBLEM FROM SSC.com WHERE PROBLEM DESCRIPTION =
    http://www.sqlservercentral.com/articles/Best+Practices/61537/[/url]

  • Dugi,

    I'll make you a deal... if you provide the test table and the test data IAW the link in my signature, I'll show you a method that will blow the doors off the other methods including the Tally table method... and I'm the one that loves Tally tables the most! 😛

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thnx Jeff I appreciate that deal ....also I found the solution with Tally table but it would be better if you post the alternative solutions with tally table!

    I started to love that tally table!

    You can see here one of the solution that I have from Michael Earl, but he found the solution after the ALZDBA suggest for tally table maybe!

    Sample data and Structure of the Table:

    There is the structure of the table:

    TR_ID NVARCHAR(15)

    TR_YEAR NVARCHAR(2)

    TR_OFFICE NVARCHAR(4)

    TR_DATECREATED DATETIME

    Sample Data:

    TR_ID TR_YEAR TR_OFFICE TR_DATECREATED (DD/MM/YYYY)

    082011000000001 08 2011 01/01/2008

    082011000000002 08 2011 01/01/2008

    082011000000003 08 2011 02/01/2008

    082011000000005 08 2011 02/01/2008

    and the one of the solution is:

    CREATE TABLE #MyTest

    (

    MyID VARCHAR(15)

    )

    INSERT #MyTest VALUES ('082011000000001')

    INSERT #MyTest VALUES ('082011000000002')

    INSERT #MyTest VALUES ('082011000000003')

    INSERT #MyTest VALUES ('082011000000005')

    INSERT #MyTest VALUES ('082011000000006')

    ; WITH MyList (Val)

    AS (SELECT N + 82011000000000 FROM Tally)

    SELECT

    M.MyID

    , CONVERT(BIGINT,M.MyID) AS MyIDInt

    , T.Val

    FROM

    MyList T

    LEFT JOIN #MyTest M ON T.Val = CONVERT(BIGINT,M.MyID)

    WHERE

    T.Val BETWEEN (SELECT MIN(CONVERT(BIGINT,MyID)) FROM #MyTest)

    AND (SELECT MAX(CONVERT(BIGINT,MyID)) FROM #MyTest)

    So if you have the alternatives ok! I hate the situation why confusion is online with me in these cases!

    Many thnx Jeff!

    Dugi

    ============================================================
    SELECT YOUR PROBLEM FROM SSC.com WHERE PROBLEM DESCRIPTION =
    http://www.sqlservercentral.com/articles/Best+Practices/61537/[/url]

  • I didn't forget ya... I just had a really long day and haven't been able to do the necessary writeup along with the code. I'll finish this when I get home later "tonight". It's 01:25 here right now and I need a nap. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Heh… you know me… I love the Tally table my own self! But, as I've said many times, the Tally table is NOT a panacea.

    Consider this… in your example, you have the character version of some pretty hefty numbers starting at 082011000000001. All the Tally table solutions work fine until someone gets the great idea to add another set of ID's starting at 092011000000001… that's gonna make either for a helluva big Tally table or you'll need to have two sections of code. If they add another "partition" of numbers, then you have to add another section of code because you won't have the time left in your life to grow a Tally table that big nor enough disk space ever.

    No, you have to plan on people doing things like that and that means you have to plan on being able to find the missing ID's in an unlimited fashion. Trust me on this one… plan on it happening.

    Obviously, the other problem is performance. Couple of problems with the Tally table solution there… you want to print out or join to a bazillion missing numbers on a split partition set of insanely large numbers? I don't think so. Again, that'll force you to add new sections of code to add new partitions of ID's.

    Code should be trouble free and you shouldn't have to remember to go add a new section of code every time someone wants to add a new partition of ID's.

    We've already seen a CROSS APPLY solution blow it's gears even on small sets. Tally table won't handle unlimited differences in numbers. A cursor or While loop is definitely out of the question because you could travel light years before it gets done with such huge numbers. And, we don't want the code to ever have to be touched just because some designer wants to partition the Ids.

    Believe it or not, we have to use a RBAR trick that I normally frown on pretty hard… we have to use a correlated subquery like some have tried to use Cross Apply. Because we're finding RANGES of missing numbers, it turns out to be nasty fast and can actually do Index Seeks instead of Table Scans or Index Scans.

    Basically, the code finds the leading edge of all "islands" and does a couple of very simple calculations to find the ranges of "gaps". The following code example works with the data you provided…

    --===== Declare some variables to force implicit conversions to BigInt

    -- because the MyID column is VARCHAR and won't implicity convert

    -- when simple contansts for 1 or 0 are present.

    DECLARE @Zero BIGINT, @One BIGINT

    SELECT @Zero = 0, @One = 1

    --===== Find the "gap ranges"

    -- Finds leading edge of "islands" and then computes the gaps

    SELECT GapStart = (SELECT ISNULL(MAX(b.MyID),@Zero)+@One

    FROM #MyTest b

    WHERE b.MyID < a.MyID),

    GapEnd = MyID - @One

    FROM #MyTest a

    WHERE a.MyID NOT IN (SELECT MyID+@One FROM #MyTest)

    AND a.MyID > (SELECT MIN(CONVERT(BIGINT,MyID)) FROM #MyTest)

    Now, the real problem is that you have a VARCHAR for an ID column and it appears that you somehow think it's important to reuse the numbers to keep from having gaps, which is why you wanted to build a gap detector in the first place. Sure, just a guess on my part, but that's the usual reason. As you can see, you can have huge numbers for ID columns, so why go through all the headaches of doing so? Your current "serial number" will handle up to a billion different numbers without infringing on the non-zero digits in the first number in your example data. I can, however, understand the need to display the leading zero.

    With all of that in mind, let's combine the prerequisite performance test I often to with a method to use the benefits of an auto-numbering column and still display the leading zero's as in your original data example…

    Here's the test data... details are in the comments, as usual...

    --===== Create and populate a 1,000,000 row test table.

    -- Column "MyID" has a range of 1 to 1,000,000 unique numbers starting at 82011000000001

    -- Jeff Moden

    SELECT TOP 1000000

    MyID = IDENTITY(BIGINT,82011000000001,1)

    INTO #MyTest

    FROM Master.dbo.SysColumns sc1,

    Master.dbo.SysColumns sc2

    --===== A table is not properly formed unless a Primary Key has been assigned

    -- Takes about 3 seconds to execute.

    ALTER TABLE #MyTest

    ADD PRIMARY KEY CLUSTERED (MyID)

    --===== Now, let's add a calculated column to display leading zeros in a "field" of 15 characters

    -- like the original problem.

    ALTER TABLE #MyTest

    ADD DisplayMyID AS RIGHT('000000000000000'+CAST(MyID AS VARCHAR(15)),15)

    --===== Delete some know rows to demo the gap detection code

    -- This deletes 50 rows spaced 2000 apart in the given range

    -- to demo small gaps

    DELETE #MyTest

    WHERE MyID BETWEEN 82011000500001 AND 82011000600000

    AND MyID %2000 = 0

    -- This deletes 100,000 rows in a given range to demo large gaps

    DELETE #MyTest

    WHERE MyID BETWEEN 82011000600001 AND 82011000700000

    Here's a solution that very quickly finds all gaps including the gap that starts at a value of "1" and only takes a second to run...

    --===== Find the "gap ranges"

    -- Finds leading edge of "islands" and then computes the gaps

    -- This assumes that gaps include any whole number greater than 0

    SELECT GapStart = (SELECT ISNULL(MAX(b.MyID),0)+1

    FROM #MyTest b

    WHERE b.MyID < a.MyID),

    GapEnd = MyID - 1

    FROM #MyTest a

    WHERE a.MyID NOT IN (SELECT MyID + 1 FROM #MyTest)

    ... and here's a solution that finds gaps starting with the first number assuming that you won't use "1" as an ID... again, it does its deed in a second...

    --===== Find the "gap ranges"

    -- Finds leading edge of "islands" and then computes the gaps

    -- This assumes that gaps include only from the first number on

    -- and does so without slowing the code down

    SELECT *

    FROM (--==== Find all the gaps like before

    SELECT GapStart = (SELECT ISNULL(MAX(b.MyID),0)+1

    FROM #MyTest b

    WHERE b.MyID < a.MyID),

    GapEnd = MyID - 1

    FROM #MyTest a

    WHERE a.MyID NOT IN (SELECT MyID + 1 FROM #MyTest))g

    WHERE g.GapStart > 1

    Lemme know if you have any questions on this... I gotta get ready for work.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • O man, O man ...no comment!

    :w00t::w00t::w00t::w00t::w00t::w00t::w00t:

    ============================================================
    SELECT YOUR PROBLEM FROM SSC.com WHERE PROBLEM DESCRIPTION =
    http://www.sqlservercentral.com/articles/Best+Practices/61537/[/url]

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply