Sorting Records Going Into New Table.

  • I am having problems ordering records to go into another table. The following is my code:

    USE

    EURUSD

    GO

    SELECT *

    INTO NEWTABLE

    FROM OLDTABLE

    ORDER BY RECORD

    After processing into another table I still have a wrong starting record and gaps in the continuous records. When I process a SELECT TOP 1000 query I am expecting to see the first record proceeding through the 1000th. record. Instead I am seeing record 8625 forward. How do I create a query to create a new table with the proper sequential ordering?

  • mcertini (5/5/2013)


    I am having problems ordering records to go into another table. The following is my code:

    USE

    EURUSD

    GO

    SELECT *

    INTO NEWTABLE

    FROM OLDTABLE

    ORDER BY RECORD

    After processing into another table I still have a wrong starting record and gaps in the continuous records. When I process a SELECT TOP 1000 query I am expecting to see the first record proceeding through the 1000th. record. Instead I am seeing record 8625 forward. How do I create a query to create a new table with the proper sequential ordering?

    This will create a new column using the special system IDENTITY function which works when doing SELECT INTO. Your rows will maintain their original keys but will now have a new sequentially numbered column.

    SELECT

    IDENTITY(INT,1,1) AS ID,

    OLD.*

    INTO NEWTABLE AS NEW

    FROM OLDTABLE AS OLD

    ORDER BY RECORD

    Second option...this will also create a sequentially ordered column based on your sort criteria which gives you more flexibility on the sorting.

    SELECT

    ROW_NUMBER() OVER (ORDER BY OLD.RECORD, OLD.COL1, OLD.COL2 [etc]) AS ID,

    OLD.*

    INTO NEWTABLE AS NEW

    FROM OLDTABLE AS OLD

    There's no guarantee that when you insert large amounts of data into a table (especially an existing table) that the rows will be in physical order. If you create a pseudo-key like this for ordering you will still need to apply an ORDER BY to be sure of getting results back in your desired order.

    If you really need the rows in physical order in the file you can drop all indexes and re-create a clustered index on the columns to be ordered. The index process will then by definition re-order the data physically based on whatever column definitions you define for the clustered index.

    If for some reason your table is using uniqueidentifiers (GUIDs) as a sort key, there is a new SequentialGUID datatype available which can be used to keep your rows in sequential order. Otherwise, trying to sort based on a GUID's value is somewhat pointless.

     

  • Steven,

    Thank you for the reply it is much appreciated.

    Mike

  • Just clarifying the post above, because we need to do this many times in our company...Isn't it true that the option above - an Identity column - only will work for ordering on small sets of data because of paging? I am not sure if this is what you are trying to say above. So, the only way to get a truly ordered data set is to use the clustered index? Will RowNumber() work on large sets also? Thanks.

    Kris Robinett

  • I have to ask why order matters for the records stored in the table. SQL Server makes no guarantee about that. The only reliable way I am aware of to present records in a desired sequence is to use an order by clause when you select the records. I would never rely on the storage order of data in a SQL Server table.

    This behavior is different in the AS400 world, where you can present data in arrival order, and influence that order with an order by in the table load process.

  • Many times we need to rely on records being next to each other to determine next Value ordered by a Date, let's say. So, if the identity column is called "Ref". We need to rely on Ref = Ref-1. It doesn't necessarily have to display in that order, but the Refs have to be in order based on the fields we determine. Here's a quick example of something we would do. The Clustered Index would be on Vendor, Item, and ChangeDate for this example.

    Update tmpCost

    set cur.NextCost = nxt.ListCost

    from tmpCost cur

    inner join tmpCost nxt

    on cur.Vendor = nxt.Vendor

    and cur.Item = nxt.Item

    and cur.Ref + 1 = nxt.Ref

    Thanks.

    Kris Robinett

  • Ross McMicken (5/6/2013)


    The only reliable way I am aware of to present records in a desired sequence is to use an order by clause when you select the records. I would never rely on the storage order of data in a SQL Server table.

    That is absolutely correct 100%. Tables by their nature have no order. They are unordered sets. The only way to ensure order is with an order by clause. Storage order means absolutely nothing.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Kricky (5/6/2013)


    Many times we need to rely on records being next to each other to determine next Value ordered by a Date, let's say. So, if the identity column is called "Ref". We need to rely on Ref = Ref-1. It doesn't necessarily have to display in that order, but the Refs have to be in order based on the fields we determine. Here's a quick example of something we would do. The Clustered Index would be on Vendor, Item, and ChangeDate for this example.

    Update tmpCost

    set cur.NextCost = nxt.ListCost

    from tmpCost cur

    inner join tmpCost nxt

    on cur.Vendor = nxt.Vendor

    and cur.Item = nxt.Item

    and cur.Ref + 1 = nxt.Ref

    Thanks.

    This code breaks down if you have ever inserted rows out of sequence of the clustered key, though. For example, if you insert 10 rows for Vendor A/Item 1/ChangeDate 1-10, 10 rows for Vendor A/Item 2/ChangeDate 1-10, 10 rows for Vendor B/Item 1/ChangeDate 1-10, then 10 more rows for Vendor A/Item 1/ChangeDate 11-20, Vendor A/Item 1 will have rows with Ref 1-10 and 31-40 (the value of an IDENTITY column is incremented each time a row is inserted and doesn't depend on the clustered key order). Your update statement will update the rows with Ref 1-9 and Ref 31-39, but not Ref 10 because there is no row that satisfies the join condition cur.Ref + 1 = nxt.Ref.

    Even with a clustered index, which imposes the logical order of the clustered key on the page chain holding the data for the table (pages in the page chain and rows on each page are ordered according to the clustered key), SQL Server can and will return rows in any order unless constrained by an ORDER BY clause. Writing DML that depends on row ordering to return the correct result without specifying an ORDER BY clause is a bad idea, period.

    Jason Wolfkill

  • If you create Clustered Index on RECORD column of new table. It sorts and stores data in the clustered index key.

  • Neeraj Dwivedi (5/6/2013)


    If you create Clustered Index on RECORD column of new table. It sorts and stores data in the clustered index key.

    That is true for storage (well most of the time). But it is absolutely not true for retrieval. If you want rows in a certain order from a select there is ONLY 1 way do ensure the order, use an order by clause.

    http://blogs.msdn.com/b/conor_cunningham_msft/archive/2008/08/27/no-seatbelt-expecting-order-without-order-by.aspx

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Neeraj Dwivedi (5/6/2013)


    If you create Clustered Index on RECORD column of new table. It sorts and stores data in the clustered index key.

    the clustered index, just because it is sorted, never, ever guarantees a specific, repeatable order of the data will be returned. it might work consistently coincidentally for a while, but a WHERE statement, other indexes, parallel processing, reindexing, row locks,and many other things can affect the ordering of the data in an unordered dataset.

    ONLY an explicit Order By [ColumnName] will return consistent results in SQL server.

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Thank you both Lowell & Sean for pointing my mistake out, I was thinking about the storing data & haven't read the post properly.

  • Just to clarify: with ROW_NUMBER(), you don't have to ORDER BY all the columns you want to SELECT; just one ORDER BY column is fine if that's all you need. I don't see any real advantage of ROW_NUMBER() over identity in this case; seems just a matter of your own preference.

    SELECT

    ROW_NUMBER() OVER (ORDER BY OLD.[RECORD]) AS ID,

    OLD.*

    INTO NEWTABLE

    FROM OLDTABLE AS OLD

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply