• Peter M. (11/12/2008)


    I'd be more interested in seeing the execution times where the data size is non-trivial. My suspicion would lead me to believe the row_number() solution might scale better based on my own experience using the related dense_rank() function.

    You could even use two dense rank functions that are partitioned identically but have an order by that is based on the version number in one and the date field in the other, then using that to compare when the resulting ranks are not equal. It would require a derived table joined to the original table, but you could get the results you are looking for.

    and thanks, Jeff Moden, for pointing out the use of the clustered index on PersonId, Version.

    I was interested to see how the different solutions would scale, so I expanded the PersonRecord table to 1 million PersonIDs, and expanded the rows of data to 5 records per PersonId, using Jeff's code. And I added the clustered index.

    And I added dfarran's query.

    Here are the results:

    Original Query 153,733 Milliseconds duration

    James Goodwin -

    Left join on inner join 136,703 Milliseconds duration

    Matt C -

    SQL2005 row_number()

    over (partition 184,280 Milliseconds duration

    Adam Haines -

    inner joins and CASE 138,906 Milliseconds duration

    dfarrans -

    just left joins,

    no version-1 245,016 Milliseconds duration

    Granted, these are total execution times, so include I/O.

    Also, I did not alter the queries that used version-1, which is a weakness if the version sequence is not perfect.