• Hi Lawrence, I have spent quite a bit of time recently using Row_Number() with CTEs for paging purposes and have tried to gauge how to get the best performance possible out of them.

    While researching I read that when you use a CTE in a query, it reruns the CTE for ever column in the outer select that references it:

    such that in your example query:

    ;WITH BookCTE (RowNumber, BookTitle, BookEdition, BookPublishDate, BookAuthor)

    AS

    (

    SELECT

    ROW_NUMBER()OVER (PARTITION BY BookAuthor ORDER BY BookPublishDate DESC),

    BookTitle, BookEdition, BookPublishDate, BookAuthor

    FROM dbo.Books

    )

    SELECT BookTitle, BookEdition, BookPublishDate, BookAuthor

    FROM BookCTE

    WHERE RowNumber=1

    ORDER BY BookTitle

    it would run the CTE 4 times -- once for each column in the outer select. For the example query here that is not a big deal, but if the CTE definition is complex and involves some sort of aggregate function or group by it can be a bit tricky.

    So I have experimented with writing the query as such and also including only the minimum number of columns in the CTE as required and then having the outer query join to the tables necessary for the select. Your example can't really be refined much using this method since all of the columns in the select statement are "used" by the CTE... but if we pretended that you also needed the PageCount, PublisherName, and ISBN of the book, then using this method it would read as:

    ;WITH BookCTE (RowNumber, BookTitle, BookEdition, BookPublishDate, BookAuthor)

    AS

    (

    SELECT

    ROW_NUMBER()OVER (PARTITION BY BookAuthor ORDER BY BookPublishDate DESC),

    BookTitle, BookEdition, BookPublishDate, BookAuthor

    FROM dbo.Books

    )

    SELECT C.BookTitle, C.BookEdition, C.BookPublishDate, C.BookAuthor, B.PageCount, B.PublisherName, B.ISBN

    FROM BookCTE C inner join Books B on C.BookTitle=B.BookTitle --this isn't a very good key, you'd actually want to use the pkey of the table to join

    WHERE RowNumber=1

    ORDER BY BookTitle

    Why I bring this up is, first, I wanted to verify the assertion that I read that this is indeed what occurs behind the scenes (since I don't remember where I read it) and also to comment that I have noticed empirically that sometimes the performance is better when I include all the select columns in the CTE and write a simple outer query and sometimes it is better when I do it the way I described above. In my application, the CTE definition ends up being variable most of the time (based on the input parameters from the user), so I am having to make my best guess as to which way the performance will be better in the majority of cases.

    Do you (or any of the other SQL gurus reading this) have any thoughts on the topic or best practices on how to write these queries when they get complicated, to keep performance from going way down?

    Thanks in advance,

    Anye

    --
    Anye Mercy
    "Service Unavailable is not an Error" -- John, ENOM support
    "You keep using that word. I do not think it means what you think it means." -- Inigo Montoya in "Princess Bride"
    "Civilization exists by geologic consent, subject to change without notice." -- Will Durant