Surprised that Row_Number() has better performance than IDENTITY

  • souLTower

    SSCarpal Tunnel

    Points: 4026

    I was doing some testing to see if it's more efficient to use an identity column or to use ROW_NUMBER() to simulate a unique ID for the result set.  My thinking was that ROW_NUMBER() would need to do some type of evaluation of the result set in order to assign the value where IDENTITY would use the seed and simply increment.  Using a result set of 4 million records the ROW_NUMBER() approach was a full second faster.  This is the opposite of what I thought would happen.  Just an academic observation but I thought it was interesting.

  • Grant Fritchey

    SSC Guru

    Points: 396288

    Not that shocking. To use an IDENTITY column, you need to write data somewhere, whereas the ROW_NUMBER() is just a calculation.

    ----------------------------------------------------
    The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood...
    Theodore Roosevelt

    The Scary DBA
    Author of: SQL Server 2017 Query Performance Tuning, 5th Edition and SQL Server Execution Plans, 3rd Edition
    Product Evangelist for Red Gate Software

  • souLTower

    SSCarpal Tunnel

    Points: 4026

    Actually I wrote data in both instances.  An example is below

    -- Approach 1

    CREATE TABLE FOO (iid int identity, ky int)

    INSERT FOO(ky) SELECT ky FROM myTable ORDER BY ky

     

    -- Approach 2

    CREATE TABLE FOO (iid int, ky int)

    INSERT FOO(iid, ky) SELECT ROW_NUMBER() OVER(ORDER BY ky) as iid, ky FROM myTable ORDER BY ky

  • Grant Fritchey

    SSC Guru

    Points: 396288

    Try the test again, but use a SEQUENCE. You should see the same results. It's because of the processing needed for maintaining an identity value versus the calculation of the ROW_NUMBER.

    ----------------------------------------------------
    The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood...
    Theodore Roosevelt

    The Scary DBA
    Author of: SQL Server 2017 Query Performance Tuning, 5th Edition and SQL Server Execution Plans, 3rd Edition
    Product Evangelist for Red Gate Software

  • souLTower

    SSCarpal Tunnel

    Points: 4026

    Thanks!

    ST

  • Jeff Moden

    SSC Guru

    Points: 995650

    I'm thinking that SEQUENCE is going to be about 4 times slower.  That's just a guess.

    Sounds crazy but this is fascinating to me because I work with some really large numbers of rows in the things I do and milliseconds do matter here.  I'm putting together some of my own tests that I'll share but I've got an appointment I have to go to.

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
    "If you think its expensive to hire a professional to do the job, wait until you hire an amateur."--Red Adair
    "Change is inevitable... change for the better is not."

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 6 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply