Click here to monitor SSC
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Paging and Versioning Using ROW_NUMBER()


Paging and Versioning Using ROW_NUMBER()

Author
Message
Lawrence Moore
Lawrence Moore
SSC-Enthusiastic
SSC-Enthusiastic (147 reputation)SSC-Enthusiastic (147 reputation)SSC-Enthusiastic (147 reputation)SSC-Enthusiastic (147 reputation)SSC-Enthusiastic (147 reputation)SSC-Enthusiastic (147 reputation)SSC-Enthusiastic (147 reputation)SSC-Enthusiastic (147 reputation)

Group: General Forum Members
Points: 147 Visits: 400
Comments posted to this topic are about the item Paging and Versioning Using ROW_NUMBER()
info-1049562
info-1049562
SSC Rookie
SSC Rookie (33 reputation)SSC Rookie (33 reputation)SSC Rookie (33 reputation)SSC Rookie (33 reputation)SSC Rookie (33 reputation)SSC Rookie (33 reputation)SSC Rookie (33 reputation)SSC Rookie (33 reputation)

Group: General Forum Members
Points: 33 Visits: 9
Dear mr. Moore,

Perhaps I have a not so smart question but I wonder if this method
can be used on mysql databases as well for as far as I know one can't use stored procedures in a mysql database.

Hope te hear from you.

Kind regards,

Martin
Lawrence Moore
Lawrence Moore
SSC-Enthusiastic
SSC-Enthusiastic (147 reputation)SSC-Enthusiastic (147 reputation)SSC-Enthusiastic (147 reputation)SSC-Enthusiastic (147 reputation)SSC-Enthusiastic (147 reputation)SSC-Enthusiastic (147 reputation)SSC-Enthusiastic (147 reputation)SSC-Enthusiastic (147 reputation)

Group: General Forum Members
Points: 147 Visits: 400
Martin,
Unfortunately I don't know much about MySQL as my background is MS SQL Server.

However, you can definitely use stored procedures:
http://www.mysqltutorial.org/introduction-to-sql-stored-procedures.aspx

As for Row_Number() type functionality, I don't know if the partitioning capabilities are available, but I found the following link which discusses adding a rownumber field:
http://jimlife.wordpress.com/2008/09/09/displaying-row-number-rownum-in-mysql/

In conjunction with stored procedures, it should provide the same paging functionality, if this is what you were after.

Hope that helps,
Regards,
Lawrence
Paul White
Paul White
SSChampion
SSChampion (13K reputation)SSChampion (13K reputation)SSChampion (13K reputation)SSChampion (13K reputation)SSChampion (13K reputation)SSChampion (13K reputation)SSChampion (13K reputation)SSChampion (13K reputation)

Group: General Forum Members
Points: 13170 Visits: 11354
A well-presented and enjoyable article, thanks.



Paul White
SQLPerformance.com
SQLblog.com
@SQL_Kiwi
Anye Mercy
Anye Mercy
SSC-Enthusiastic
SSC-Enthusiastic (150 reputation)SSC-Enthusiastic (150 reputation)SSC-Enthusiastic (150 reputation)SSC-Enthusiastic (150 reputation)SSC-Enthusiastic (150 reputation)SSC-Enthusiastic (150 reputation)SSC-Enthusiastic (150 reputation)SSC-Enthusiastic (150 reputation)

Group: General Forum Members
Points: 150 Visits: 381
Hi Lawrence, I have spent quite a bit of time recently using Row_Number() with CTEs for paging purposes and have tried to gauge how to get the best performance possible out of them.

While researching I read that when you use a CTE in a query, it reruns the CTE for ever column in the outer select that references it:
such that in your example query:

;WITH BookCTE (RowNumber, BookTitle, BookEdition, BookPublishDate, BookAuthor)
AS
(
SELECT
ROW_NUMBER()OVER (PARTITION BY BookAuthor ORDER BY BookPublishDate DESC),
BookTitle, BookEdition, BookPublishDate, BookAuthor
FROM dbo.Books
)
SELECT BookTitle, BookEdition, BookPublishDate, BookAuthor
FROM BookCTE
WHERE RowNumber=1
ORDER BY BookTitle

it would run the CTE 4 times -- once for each column in the outer select. For the example query here that is not a big deal, but if the CTE definition is complex and involves some sort of aggregate function or group by it can be a bit tricky.

So I have experimented with writing the query as such and also including only the minimum number of columns in the CTE as required and then having the outer query join to the tables necessary for the select. Your example can't really be refined much using this method since all of the columns in the select statement are "used" by the CTE... but if we pretended that you also needed the PageCount, PublisherName, and ISBN of the book, then using this method it would read as:

;WITH BookCTE (RowNumber, BookTitle, BookEdition, BookPublishDate, BookAuthor)
AS
(
SELECT
ROW_NUMBER()OVER (PARTITION BY BookAuthor ORDER BY BookPublishDate DESC),
BookTitle, BookEdition, BookPublishDate, BookAuthor
FROM dbo.Books
)
SELECT C.BookTitle, C.BookEdition, C.BookPublishDate, C.BookAuthor, B.PageCount, B.PublisherName, B.ISBN
FROM BookCTE C inner join Books B on C.BookTitle=B.BookTitle --this isn't a very good key, you'd actually want to use the pkey of the table to join
WHERE RowNumber=1
ORDER BY BookTitle

Why I bring this up is, first, I wanted to verify the assertion that I read that this is indeed what occurs behind the scenes (since I don't remember where I read it) and also to comment that I have noticed empirically that sometimes the performance is better when I include all the select columns in the CTE and write a simple outer query and sometimes it is better when I do it the way I described above. In my application, the CTE definition ends up being variable most of the time (based on the input parameters from the user), so I am having to make my best guess as to which way the performance will be better in the majority of cases.

Do you (or any of the other SQL gurus reading this) have any thoughts on the topic or best practices on how to write these queries when they get complicated, to keep performance from going way down?

Thanks in advance,
Anye

--
Anye Mercy
"Service Unavailable is not an Error" -- John, ENOM support
"You keep using that word. I do not think it means what you think it means." -- Inigo Montoya in "Princess Bride"
"Civilization exists by geologic consent, subject to change without notice." -- Will Durant
Lawrence Moore
Lawrence Moore
SSC-Enthusiastic
SSC-Enthusiastic (147 reputation)SSC-Enthusiastic (147 reputation)SSC-Enthusiastic (147 reputation)SSC-Enthusiastic (147 reputation)SSC-Enthusiastic (147 reputation)SSC-Enthusiastic (147 reputation)SSC-Enthusiastic (147 reputation)SSC-Enthusiastic (147 reputation)

Group: General Forum Members
Points: 147 Visits: 400
Anye,

Many thanks for your post.

It's news to me that the performance of a CTE query is based on the number of columns in the "Outer" query.

Rather, CTEs are generally very efficient as the processing is done with one pass of the data.

Certainly, a very quick investigation using SET STATISTICS IO does not raise any concerns.

For example, running:
SET STATISTICS IO ON

;WITH BookCTE (RowNumber, BookTitle, BookEdition, BookPublishDate, BookAuthor)
AS
(
SELECT
ROW_NUMBER()OVER (PARTITION BY BookTitle ORDER BY BookEdition DESC),
BookTitle, BookEdition, BookPublishDate, BookAuthor
FROM dbo.Books
)
SELECT BookTitle, BookEdition, BookPublishDate, BookAuthor
FROM BookCTE
WHERE RowNumber=1
ORDER BY BookTitle


...gives the following output:
Table 'Books'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Running the same query but only returning a single column (e.g. BookTitle) yields the same IO results.

See if you can find the reference that stated this behaviour, or otherwise provide a setup that would show this to be the case. Perhaps it occurs with very large tables where the processing cannot be done fully in memory(?)

I'd obviously welcome other experts' views on this point.

Best regards,
Lawrence.
rramirez
rramirez
Forum Newbie
Forum Newbie (1 reputation)Forum Newbie (1 reputation)Forum Newbie (1 reputation)Forum Newbie (1 reputation)Forum Newbie (1 reputation)Forum Newbie (1 reputation)Forum Newbie (1 reputation)Forum Newbie (1 reputation)

Group: General Forum Members
Points: 1 Visits: 11
This article is a good example of how technical information can be written in a clear language.

Thanks for sharing this useful technique!!!

Roberto R.
Anye Mercy
Anye Mercy
SSC-Enthusiastic
SSC-Enthusiastic (150 reputation)SSC-Enthusiastic (150 reputation)SSC-Enthusiastic (150 reputation)SSC-Enthusiastic (150 reputation)SSC-Enthusiastic (150 reputation)SSC-Enthusiastic (150 reputation)SSC-Enthusiastic (150 reputation)SSC-Enthusiastic (150 reputation)

Group: General Forum Members
Points: 150 Visits: 381
Ok, I dug it up (or dug up something else that was related) and I apparently misread it -- the article I read says that the CTE is executed the number of times the CTE itself is referenced (i.e. # of joins) x the number of rows from the anchor.

http://sqlblogcasts.com/blogs/tonyrogerson/archive/2008/05/17/non-recursive-common-table-expressions-performance-sucks-1-cte-self-join-cte-sub-query-inline-expansion.aspx

http://sqlblogcasts.com/blogs/tonyrogerson/archive/2008/05/17/non-recursive-common-table-expressions-performance-sucks-2-row-number-is-executed-number-of-cte-references-x-number-of-rows-from-the-anchor.aspx

His examples JOIN to the CTE multiple times and therein lies the performance hit, as opposed to # of columns from the CTE as I had previously believed.

However, it still leaves me curious as to why sometimes using a simpler CTE with more joins in the outer query performs better than the other way around. I will play around with this with statistics on with some of my "hairier" queries and see what it comes back with.

--
Anye Mercy
"Service Unavailable is not an Error" -- John, ENOM support
"You keep using that word. I do not think it means what you think it means." -- Inigo Montoya in "Princess Bride"
"Civilization exists by geologic consent, subject to change without notice." -- Will Durant
Lawrence Moore
Lawrence Moore
SSC-Enthusiastic
SSC-Enthusiastic (147 reputation)SSC-Enthusiastic (147 reputation)SSC-Enthusiastic (147 reputation)SSC-Enthusiastic (147 reputation)SSC-Enthusiastic (147 reputation)SSC-Enthusiastic (147 reputation)SSC-Enthusiastic (147 reputation)SSC-Enthusiastic (147 reputation)

Group: General Forum Members
Points: 147 Visits: 400
Anye,

Thanks for posting a follow up.

It sounds like any useful findings you gather could form the basis for an interesting CTE article. ;-)

With regards,
Lawrence.
grc-80104
grc-80104
Grasshopper
Grasshopper (19 reputation)Grasshopper (19 reputation)Grasshopper (19 reputation)Grasshopper (19 reputation)Grasshopper (19 reputation)Grasshopper (19 reputation)Grasshopper (19 reputation)Grasshopper (19 reputation)

Group: General Forum Members
Points: 19 Visits: 109
Great article. I've used ROW_NUMBER() in CTE to evaluate (compare) data in the previous rows or next rows. It was a real life saver. :-)
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search