Sequential Ordering of a Column

  • I have a table that contains a SortOrder field that allows the user to re-arrange the order in which items are returned by the query. 

    tbl_Names

    Name   SortOrder

    ------  ---------

    Jeff          1

    Tom         2

    Larry        5

    Emily       11

    The gaps in the Sort order are caused by records being deleted.

    How can I write a query that renumbers the SortOrder column and preserves the original order.  The result of the Update should be:

    Name   SortOrder

    ------  ---------

    Jeff          1

    Tom         2

    Larry        3

    Emily        4

     

    I apologize if this is a trivial question.  I cant seem to get my arms around it.

     

    Thanks,

    Steve

  • UPDATE A

    SET A.SortOrder =

    (SELECT COUNT(*) FROM tbl_Names Z WHERE Z.SortOrder <= A.SortOrder)

    FROM tbl_Names A

  • Dont worry... it's not a trivial question...

    There's a half dozen or more ways to do such a thing... Koji posted one of them... but, beware... Koji's code relies on a "triangular join" which is a bit more than half of a full cross-join... it will simply "die" after you get about 10,000 rows in the table.

    So, my question would be, how many rows are in the table you need to do this on?  Also, why do you need to do this?  An INT column will handle over 2 billion numbers... why are you trying to recover the missing numbers?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Make use of the new ROW_NUMBER() function available in SQL Server 2005.

    And depending on your business rules, RANK() or DENSE_RANK() would be helpful too.

     


    N 56°04'39.16"
    E 12°55'05.25"

  • Hi, Jeff,

    I would like to study on "triangular join."

    Would you point me to right direction?

    Where can get some information?

  • 'So, my question would be, how many rows are in the table you need to do this on?  Also, why do you need to do this?  An INT column will handle over 2 billion numbers... why are you trying to recover the missing numbers?'

    There are over 100k rows in the table but the reordering just acts upon a subset of these at any one time.  No more than 1000 would be in any subset.   The reason for this renumbering is that the application allows a user to insert records before or after the specified record and the current application logic is all done based on ordinal position.  So to insert a new record before record number 3 requires all records greater than 3 to be shifted down one and the new record to be inserted as SortOrder 3.

    Can you give me an example of another type of solution to this so I can study the various approaches?  Thanks!

    Steve

     

  • DECLARE

    @Sample TABLE (Name VARCHAR(10), SortOrder INT)

    INSERT

    @Sample

    SELECT 'Jeff', 1 UNION ALL

    SELECT 'Tom', 2 UNION ALL

    SELECT 'Larry', 5 UNION ALL

    SELECT 'Emily', 11

    SELECT

    * FROM @Sample ORDER BY SortOrder

    UPDATE s

    SET s.SortOrder = x.RecID

    FROM @Sample AS s

    INNER JOIN (

    SELECT Name,

    SortOrder,

    ROW_NUMBER() OVER (ORDER BY SortOrder) AS RecID

    FROM @Sample

    ) AS x ON x.Name = s.Name AND x.SortOrder = s.SortOrder

    SELECT

    * FROM @Sample ORDER BY SortOrder

     


    N 56°04'39.16"
    E 12°55'05.25"

  • Not sure where to look, Koji... it's mostly personal experience.  Like I said, it's really half a cross-join... I guess my suggestion would to be to readup on cross-joins and correlated sub-queries.

    If you don't come up with anything worth while on those two searches, post back and I'll write something up...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • heh... I guess I'm just gonna have to break down and upgrade one of these near days...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • A simpler method!

    DECLARE

    @Sample TABLE (Name VARCHAR(10), SortOrder INT)

    INSERT

    @Sample

    SELECT 'Jeff', 1 UNION ALL

    SELECT 'Tom', 2 UNION ALL

    SELECT 'Larry', 5 UNION ALL

    SELECT 'Emily', 11

    SELECT

    * FROM @Sample ORDER BY SortOrder

    UPDATE

    s

    SET s.SortOrder = s.RecID

    FROM (

    SELECT SortOrder,

    ROW_NUMBER() OVER (ORDER BY SortOrder) AS RecID

    FROM @Sample

    ) AS s

    SELECT * FROM @Sample ORDER BY SortOrder

     


    N 56°04'39.16"
    E 12°55'05.25"

  • Hi there, Koji,

    Got your private message and thought I'd post my reply here so that others can see it, as well...

    Koji,

    There's not much to a Triangular Join (least, not up front)... like I said, it's a bit more than half a cross join...

    Lets say you have two tables, A and B, each with 4 rows numbered 1 through 4 like this...

     CREATE TABLE TableA (RowNum INT)

     INSERT INTO TableA (RowNum)

     SELECT 1 UNION ALL

     SELECT 2 UNION ALL

     SELECT 3 UNION ALL

     SELECT 4

     CREATE TABLE TableB (RowNum INT)

     INSERT INTO TableB (RowNum)

     SELECT 1 UNION ALL

     SELECT 2 UNION ALL

     SELECT 3 UNION ALL

     SELECT 4

    We can create an intentional cross-join between the two tables that will return every combination of A and B using either of the two follow code snippets...

    --===== A cross-join that produces 16 combinations

     SELECT a.RowNum AS RowNum_A,b.RowNum AS RowNum_B

       FROM TableA a,

            TableB b

    --===== Same cross-join using new ANSI syntax produces 16 combinations

     SELECT a.RowNum AS RowNum_A,b.RowNum AS RowNum_B

       FROM TableA a

      CROSS JOIN

            TableB b

    The results of both cross-joins form a result set that can be depicted much as you would a simple multiplication table (with 1 axis backwards from the other) like this... (blue area identifies all returns from the cross-join)... this is why the formula for the number of internally spawned rows for a cross-join is simply X2 for equal length tables or A*B for unequal length tables...

    TableB

    4

    3

    2

    1

    TableA

    1

    1,4

    1,3

    1,2

    1,1

    2

    2,4

    2,3

    2,2

    2,1

    3

    3,4

    3,3

    3,2

    3,1

    4

    4,4

    4,3

    4,2

    4,1

    A Triangular Join is nothing more than about half of a Cross-Join.  An example using the two tables we created would be...

    --===== A triangular-join that produces 10 combinations

     SELECT a.RowNum AS RowNum_A,b.RowNum AS RowNum_B

       FROM TableA a,

            TableB b

      WHERE a.RowNum <= b.RowNum

    ...and the result set can be represented as in the blue area of the following graphic... notice that it forms a triangle and that's why these are called "Triangular Joins".  The formula for the number of interal rows spawned for this type ( <= ) of triangular join is (X2+X)/2 (the old "sum of the sequential numbers formula")...

    TableB

    4

    3

    2

    1

    TableA

    1

    1,4

    1,3

    1,2

    1,1

    2

    2,4

    2,3

    2,2

    2,1

    3

    3,4

    3,3

    3,2

    3,1

    4

    4,4

    4,3

    4,2

    4,1

    If your code snippet is sans an equality, like this...

    --===== A triangular-join that produces 6 combinations

     SELECT a.RowNum AS RowNum_A,b.RowNum AS RowNum_B

       FROM TableA a,

            TableB b

      WHERE a.RowNum < b.RowNum

    You still end up with a triangular join, but smaller.  It's represented by the formula (if tables have equal row counts) of (X2-X)/2

    TableB

    4

    3

    2

    1

    TableA

    1

    1,4

    1,3

    1,2

    1,1

    2

    2,4

    2,3

    2,2

    2,1

    3

    3,4

    3,3

    3,2

    3,1

    4

    4,4

    4,3

    4,2

    4,1

    Last, but not least, you can do the "double" triangular join using a full inequality... this is great for making schedules of teams to play teams and is represented by the following code snippet...

    --===== A triangular-join that produces 12 combinations

     SELECT a.RowNum AS RowNum_A,b.RowNum AS RowNum_B

       FROM TableA a,

            TableB b

      WHERE a.RowNum <> b.RowNum

    And, from the following graphic, you can see why it's called a "double" triangular join.  Because of the inequality, it makes good team play schedules because no team ever plays against itself... the formula to calculate rows is simply X2-X...

    TableB

    4

    3

    2

    1

    TableA

    1

    1,4

    1,3

    1,2

    1,1

    2

    2,4

    2,3

    2,2

    2,1

    3

    3,4

    3,3

    3,2

    3,1

    4

    4,4

    4,3

    4,2

    4,1

    Lot's of people use triangular joins to do things like make row numbers in their code.  The problem is the number of internal joins the code does to produce the row numbers (some call this a running count)... take the following code, for example...

    --===== Create a small demo table

     CREATE TABLE TableC (RowNum INT)

     INSERT INTO TableC (RowNum)

     SELECT 84 UNION ALL

     SELECT  5 UNION ALL

     SELECT 99 UNION ALL

     SELECT 12

    ... and, let's list all the values with a running count...

    --===== List the items with a running count

     SELECT RunCount = (SELECT COUNT(*) FROM TableC cs WHERE cs.SomeVal <= c.SomeVal),

            c.SomeVal

       FROM TableC c

      ORDER BY RunCount

    ... notice the correlated subquery (a query that makes reference outside itself) and the big red triangular join it makes with the outer query?  Although the result set that gets displayed only returns 4 rows, the number of INNER rows the query had to touch is (X2+X)/2.  In this case, it had to touch 10 rows just to get the RunCount (X=#of rows in table which is 4 for this example).  Check the ACTUAL (not estimated) execution plan... you'll see I'm right... then, it also has to touch the normal 4 rows to make the return.  So, the real formula for the number of rows touched is actually ((X2+X)/2) + X.... or 14 rows just to return 4.  Let's see what happens when we try that on a table of 10,000 rows...

    ((X2+X)/2)+X = ((10,0002+10,000)/2)+10,000 = 50,015,000 internal rows

    Ya see that big ol' number of internal rows touched?  That's right, it has to look at over 50 MILLION rows to generate a stupid little running count for 10,000 rows.  Let's try it for 20,000 rows...

    ((X2+X)/2)+X = ((20,0002+20,000)/2)+20,000 = 200,030,000 internal rows

    That's right... it's not twice as bad... it's almost 4 times worse!!!

    THAT's why I say, "Be careful, it has a triangular join in it" and "Be careful, there's a correlated subquery in that" and "You've gotta be kidding me!!!". 

    So, with some restraint and the right criteria, triangular joins can be used for some pretty remarkable things... make finite schedules... do high speed dupe checks... etc.  But, you've gotta be really careful... they can bring a CPU and Disk System right to it's knees if you're not.

    What else would you like to know about trigangular joins?  For more information, I recommend the following GOOGLE search... WHAT IS A "TRIANGULAR JOIN" SQL

    You'll find that I'm not the only one that says "Be careful when using triangular joins".

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • The physically most efficient solution is to use a temporary table with an identity column to be new sort order.

    IF OBJECT_ID('tempdb..#Sample') is not null drop table #Sample

    IF OBJECT_ID('tempdb..#OrderNew') is not null drop table #OrderNew

    go

    -- Generate test data

    select id as PK

    , id as SortOrder

    into #Sample

    from master.dbo.sysobjects

    create unique clustered index Sample_P on #Sample (PK)

    go

    -- Temporary table with new sortOrder

    create table #OrderNew

    ( PK integer not null

    , SortOrderOldinteger

    , SortOrderNewinteger IDENTITY(1,1)

    )

    BEGIN TRAN

    -- populate new sortOrder

    -- Lock the source table to prevent any changes

    insert into #OrderNew

    ( PK , SortOrderOld )

    select PK , SortOrder

    from #Sample with (TABLOCKX)

    order by SortOrder

    go

    create unique clustered index OrderNew_P on #OrderNew (PK)

    go

    update #Sample

    setSortOrder = ( #OrderNew.SortOrderNew * 2 ) -- allow gaps

    from#OrderNew

    where#OrderNew.PK = #Sample.Pk

    go

    commit

    go

    select * from #Sample

    SQL = Scarcely Qualifies as a Language

  • That's probably right in SQL Server 2000... dunno about 2005 with that nice new RowNum and PARTITION OVER statement.

    I do gotta ask though,  why on Earth are you declaring an explicit transaction on a temporary table?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • The #Sample is for demonstration purposes only but when it is a base table, ANY reordering process does not work should any other connection perform an insert, update or delete on the table.

    SQL = Scarcely Qualifies as a Language

  • Ok... let's say you do want an explicit transaction for that... where's the error check and rollback code that should go with it?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply