Surprised that Row_Number() has better performance than IDENTITY

  • I was doing some testing to see if it's more efficient to use an identity column or to use ROW_NUMBER() to simulate a unique ID for the result set.  My thinking was that ROW_NUMBER() would need to do some type of evaluation of the result set in order to assign the value where IDENTITY would use the seed and simply increment.  Using a result set of 4 million records the ROW_NUMBER() approach was a full second faster.  This is the opposite of what I thought would happen.  Just an academic observation but I thought it was interesting.

  • Not that shocking. To use an IDENTITY column, you need to write data somewhere, whereas the ROW_NUMBER() is just a calculation.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Actually I wrote data in both instances.  An example is below

    -- Approach 1

    CREATE TABLE FOO (iid int identity, ky int)

    INSERT FOO(ky) SELECT ky FROM myTable ORDER BY ky

     

    -- Approach 2

    CREATE TABLE FOO (iid int, ky int)

    INSERT FOO(iid, ky) SELECT ROW_NUMBER() OVER(ORDER BY ky) as iid, ky FROM myTable ORDER BY ky

  • Try the test again, but use a SEQUENCE. You should see the same results. It's because of the processing needed for maintaining an identity value versus the calculation of the ROW_NUMBER.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Thanks!

    ST

  • I'm thinking that SEQUENCE is going to be about 4 times slower.  That's just a guess.

    Sounds crazy but this is fascinating to me because I work with some really large numbers of rows in the things I do and milliseconds do matter here.  I'm putting together some of my own tests that I'll share but I've got an appointment I have to go to.

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • This was removed by the editor as SPAM

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply