Row_Number doubt

  • I have a large table having millions of rows. There is one datetime column which captures the date and time a row is inserted to the table. Now, I want to copy the table's data into another server's table with same structure. Here the client wants me to migrate the data in a page by page manner (say 100 rows at a time). This table does not have any identity or unique key. I am thinking of using the row_number () function on the datetime column. But I have one doubt... say the trasaction fails somewhere in between... say at row number 115... it is written like the process will start again from 101 and not from 1. But is there any guarantee that the row_numbers will be same considering the datetime field can have duplicate values?

    Regards,

    Snigdha

  • Hi,

    If I were you, I would create the new table with identity key + the same structure of the table you're migrating.

    Regards

    IgorMi

    Igor Micev,My blog: www.igormicev.com

  • IgorMi (6/19/2012)


    Hi,

    If I were you, I would create the new table with identity key + the same structure of the table you're migrating.

    Regards

    IgorMi

    +1.

    Only creating an Identity Column in the new Table can give you that guarantee.

    Vinu Vijayan

    For better and faster solutions please check..."How to post data/code on a forum to get the best help" - Jeff Moden[/url] πŸ˜‰

  • Yes I know, but my problem is... we have a whole DB's tables to be migrated and client is reluctant to creating extra identity columns for DB size issues when i suggested that πŸ™

  • snigdhandream (6/19/2012)


    Yes I know, but my problem is... we have a whole DB's tables to be migrated and client is reluctant to creating extra identity columns for DB size issues when i suggested that πŸ™

    Aham, I understand you,

    Why not create the table suggested above as #temp table and then move all the fields from #temp except the identity key, to the destination table?

    Regards

    IgorMi

    Igor Micev,My blog: www.igormicev.com

  • snigdhandream (6/19/2012)


    Yes I know, but my problem is... we have a whole DB's tables to be migrated and client is reluctant to creating extra identity columns for DB size issues when i suggested that πŸ™

    You already stated the tables your client have, had no primary key and no unique identifiers. And they are concerned about size of an INT column for an Identity column?

    I call that an ex-client. That is just bad design, bad practice and a bad ending no matter how good you might be.

  • snigdhandream (6/19/2012)


    I have a large table having millions of rows. There is one datetime column which captures the date and time a row is inserted to the table. Now, I want to copy the table's data into another server's table with same structure. Here the client wants me to migrate the data in a page by page manner (say 100 rows at a time). This table does not have any identity or unique key. I am thinking of using the row_number () function on the datetime column. But I have one doubt... say the trasaction fails somewhere in between... say at row number 115... it is written like the process will start again from 101 and not from 1. But is there any guarantee that the row_numbers will be same considering the datetime field can have duplicate values?

    Regards,

    Snigdha

    Is there a combination of columns which can together form a unique key? Or to put it another way, are there any row-wise exact dupes in the table? If you can find a combination of columns which form a unique key, then something like the following would work:

    SELECT

    segment,

    RangeStart = MIN(n),

    RangeEnd = MAX(n)

    FROM (

    SELECT [Segment] = NTILE(5) OVER(ORDER BY n), *

    FROM ( -- sample data

    SELECT 0 AS n UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL

    SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9 UNION ALL

    SELECT 10 UNION ALL SELECT 11 UNION ALL SELECT 12 UNION ALL SELECT 13 UNION ALL SELECT 14 UNION ALL

    SELECT 15 UNION ALL SELECT 16 UNION ALL SELECT 17 UNION ALL SELECT 18 UNION ALL SELECT 19 UNION ALL

    SELECT 20 UNION ALL SELECT 21 UNION ALL SELECT 22 UNION ALL SELECT 23 UNION ALL SELECT 24 UNION ALL

    SELECT 25 UNION ALL SELECT 26 UNION ALL SELECT 27 UNION ALL SELECT 28 UNION ALL SELECT 29 UNION ALL

    SELECT 30 UNION ALL SELECT 31

    ) r (n)

    ) d

    GROUP BY segment

    β€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Going along with the previous response, if you cannot come up with a unique constraint (single or multiple columns) then it really doesn't matter if duplicates are moved or not EDIT(in many cases, not all). If there is no way to tie back to the data uniquely, then it may be irrelevant. It sounds like you may have a natural unique key on your datetime column (assuming it is datetime and not date). Have you tried doing a distinct count on some of the columns?

    Jared
    CE - Microsoft

  • Thank you guys...

    This time I pressurized the client to consider adding unique keys to the tables those don't have one already. And I think they are considering, because they asked me to look into the DB and list out all the tables not having a unique key :P. In this case, row_number() will work πŸ™‚

    Thanks for your valuable suggestions πŸ™‚

    Regards,

    Snigdha

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply