﻿<?xml version='1.0' encoding='UTF-8'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/"><channel><title>SQLServerCentral / Article Discussions / Article Discussions by Author / Discuss content posted by Lawrence Moore  / Paging and Versioning Using ROW_NUMBER() / Latest Posts</title><generator>InstantForum.NET v2.9.0</generator><description>SQLServerCentral</description><link>http://www.sqlservercentral.com/Forums/</link><webMaster>notifications@sqlservercentral.com</webMaster><lastBuildDate>Sun, 19 May 2013 19:28:54 GMT</lastBuildDate><ttl>20</ttl><item><title>RE: Paging and Versioning Using ROW_NUMBER()</title><link>http://www.sqlservercentral.com/Forums/Topic937225-2718-1.aspx</link><description>some solutions become simple and convenient since RANK function(NOW_NUMBER,RANK and DENSE_RANK).</description><pubDate>Mon, 21 Jun 2010 20:19:51 GMT</pubDate><dc:creator>changbluesky</dc:creator></item><item><title>RE: Paging and Versioning Using ROW_NUMBER()</title><link>http://www.sqlservercentral.com/Forums/Topic937225-2718-1.aspx</link><description>Good one.. :)</description><pubDate>Thu, 17 Jun 2010 08:00:08 GMT</pubDate><dc:creator>sandeep.cs3</dc:creator></item><item><title>RE: Paging and Versioning Using ROW_NUMBER()</title><link>http://www.sqlservercentral.com/Forums/Topic937225-2718-1.aspx</link><description>Jesus,I'm not completely sure what you are asking.Could you elaborate please?Many thanks,Lawrence</description><pubDate>Wed, 16 Jun 2010 14:09:08 GMT</pubDate><dc:creator>Lawrence Moore</dc:creator></item><item><title>RE: Paging and Versioning Using ROW_NUMBER()</title><link>http://www.sqlservercentral.com/Forums/Topic937225-2718-1.aspx</link><description>Mr. Lawrence Moore,Then according to this algorithm, Is will be possible to predict how many pages will be in the Query?</description><pubDate>Wed, 16 Jun 2010 13:30:07 GMT</pubDate><dc:creator>jesus salazar</dc:creator></item><item><title>RE: Paging and Versioning Using ROW_NUMBER()</title><link>http://www.sqlservercentral.com/Forums/Topic937225-2718-1.aspx</link><description>JJ B,Many thanks for your post and kind comments.It's certainly been a valuable process for me to discover the performance limitations of ROW_NUMBER(), and be able to demonstrate that they can be overcome with appropriate indexing strategies.However, it's clear that any ROW_NUMBER() based solution across large amounts of data does have some gotchas. Should the "ideal" index(es) not be available, or can't be created due to environmental constraints, other solutions are faster.As always, it's about understanding your data and the queries that are run across it.Kind regards,Lawrence</description><pubDate>Wed, 16 Jun 2010 11:03:46 GMT</pubDate><dc:creator>Lawrence Moore</dc:creator></item><item><title>RE: Paging and Versioning Using ROW_NUMBER()</title><link>http://www.sqlservercentral.com/Forums/Topic937225-2718-1.aspx</link><description>Lawrence:  I wanted to say that I appreciate both the original article and all your follow up work.  I had read elsewhere about performance problems with the functions like row_number, but I didn't know what could be done to help alleviate the problem.  Row_number is just too darn useful to want to give up.  Nice that there is something that can be done to help with performance.  Thanks.</description><pubDate>Wed, 16 Jun 2010 08:45:52 GMT</pubDate><dc:creator>JJ B</dc:creator></item><item><title>RE: Paging and Versioning Using ROW_NUMBER()</title><link>http://www.sqlservercentral.com/Forums/Topic937225-2718-1.aspx</link><description>All,With Henrik's input, I've been able to deduce that the following Index improves the "Latest Edition" query considerably:CREATE UNIQUE INDEX ix_Books_Test1 ON dbo.Books(BookTitle, BookEdition DESC) INCLUDE (BookPublishDate, BookAuthor)This is an ideal index created to serve this query only, and may not help others.Regards,Lawrence.</description><pubDate>Wed, 16 Jun 2010 06:47:47 GMT</pubDate><dc:creator>Lawrence Moore</dc:creator></item><item><title>RE: Paging and Versioning Using ROW_NUMBER()</title><link>http://www.sqlservercentral.com/Forums/Topic937225-2718-1.aspx</link><description>All,Further to my last post: it's worth pointing out that following Blah's example, I have increased the number of rows in the dbo.Books table considerably to test this.I used the following code snippet:[b]--Add additional data (run multiple times to get 500000 rows)INSERT dbo.Books( BookTitle ,BookEdition ,BookPublishDate ,BookAuthor)SELECT BookTitle, BookEdition, BookPublishDate, BookAuthorFROM dbo.BooksGO 16--Set BookEdition to BookId for better data profileUPDATE dbo.Books SET BookEdition=BookId[/b]NB: For the UPDATE to work, the BookEdition column datatype must be increased from the original SMALLINT datatype to BIGINT for example.Thanks,Lawrence</description><pubDate>Wed, 16 Jun 2010 06:27:45 GMT</pubDate><dc:creator>Lawrence Moore</dc:creator></item><item><title>RE: Paging and Versioning Using ROW_NUMBER()</title><link>http://www.sqlservercentral.com/Forums/Topic937225-2718-1.aspx</link><description>Henrik,Many thanks for your response.I've played around with indexing strategies with my "Latest Edition" example using the dbo.Books table, and nothing is really helping.The strange thing is that from an IO point of view, the ROW_NUMBER() solution does show a smaller scan and logical read count. However, the CPU time is very high.For example:[b]SET STATISTICS IO ONSET STATISTICS TIME ON;WITH BookCTE (RowNumber, BookTitle, BookEdition, BookPublishDate, BookAuthor)  AS  (        SELECT        ROW_NUMBER()OVER (PARTITION BY BookTitle ORDER BY BookEdition DESC),        BookTitle, BookEdition, BookPublishDate, BookAuthor        FROM dbo.Books  ) SELECT BookTitle, BookEdition, BookPublishDate, BookAuthor FROM BookCTE WHERE RowNumber=1 ORDER BY BookTitleSQL Server parse and compile time:    CPU time = 0 ms, elapsed time = 28 ms.(10 row(s) affected)Table 'Books'. Scan count 1, logical reads 10132, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.SQL Server Execution Times:   CPU time = 18297 ms,  elapsed time = 18295 ms.select b.BookTitle, b.BookEdition, b.BookPublishDate, b.BookAuthor from books binner join(select booktitle, max(bookedition) as bookedition from books group by booktitle)qon b.booktitle = q.booktitle and b.bookedition = q.bookedition;SQL Server parse and compile time:    CPU time = 0 ms, elapsed time = 1 ms.(10 row(s) affected)Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.Table 'Books'. Scan count 2, logical reads 20264, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.SQL Server Execution Times:   CPU time = 3609 ms,  elapsed time = 3741 ms.[/b]I think I may need Itzik to take a look at this. ;-)Will carry on with research, as time allows.Regards,Lawrence.</description><pubDate>Wed, 16 Jun 2010 06:24:13 GMT</pubDate><dc:creator>Lawrence Moore</dc:creator></item><item><title>RE: Paging and Versioning Using ROW_NUMBER()</title><link>http://www.sqlservercentral.com/Forums/Topic937225-2718-1.aspx</link><description>Lawrence,Itzik Ben-Gan has an article in the latest edition (June 2010) of SQL Server Maganzine that is about indexing for Row_number/Over/Partition bySo he thinks that “indexing guidelines are to have the ranking ordering columns in the index key list, either in specified order or exactly reversed, plus include the rest of the columns from the query in the INCLUDE clause”So you should probably create an index on hfv_hedge_rel_id, reporting_dateHIHHenrik Staun Poulsenwww.stovi.com</description><pubDate>Wed, 16 Jun 2010 04:59:01 GMT</pubDate><dc:creator>henrik staun poulsen</dc:creator></item><item><title>RE: Paging and Versioning Using ROW_NUMBER()</title><link>http://www.sqlservercentral.com/Forums/Topic937225-2718-1.aspx</link><description>Lawrence,Thank you for your answer. I will have a look at the links you gave me.Kind regards,Martin</description><pubDate>Tue, 15 Jun 2010 15:18:56 GMT</pubDate><dc:creator>info-1049562</dc:creator></item><item><title>RE: Paging and Versioning Using ROW_NUMBER()</title><link>http://www.sqlservercentral.com/Forums/Topic937225-2718-1.aspx</link><description>Thanks for your post Blah Baby.It seems that the Row_Number() clause is the cause of the slow running, rather than the CTE itself.(Worrying given the subject of my article was the former, and not the latter. :-(   )For example, the following is fast:[b];WITH BookCTE (BookTitle, BookEdition)AS  (        select booktitle, max(bookedition) as bookedition from books group by booktitle )  SELECT b.BookTitle, b.BookEdition, b.BookPublishDate, b.BookAuthor  FROM dbo.Books b INNER JOIN BookCTE c ON b.BookTitle=c.BookTitle AND b.BookEdition=c.BookEdition[/b] The issue of why ROW_NUMBER() is signifcantly slower as data volumes increase in this case, is counter-intuitive to me, and needs additional research.Thanks again.Lawrence.</description><pubDate>Tue, 15 Jun 2010 10:41:42 GMT</pubDate><dc:creator>Lawrence Moore</dc:creator></item><item><title>RE: Paging and Versioning Using ROW_NUMBER()</title><link>http://www.sqlservercentral.com/Forums/Topic937225-2718-1.aspx</link><description>You can avoid CTE by doing this to incorporate the row_number function in the where clauseselect * from (select ROW_NUMBER()OVER (ORDER BY column_order_by) as RowNbr,col_1,col_2from inner_table with(nolock)) as outer_tablewhere RowNbr &amp;lt; 5 --&amp;gt; use row number in the where clause</description><pubDate>Tue, 15 Jun 2010 10:37:30 GMT</pubDate><dc:creator>dilip_baboo</dc:creator></item><item><title>RE: Paging and Versioning Using ROW_NUMBER()</title><link>http://www.sqlservercentral.com/Forums/Topic937225-2718-1.aspx</link><description>CTEs are generally poor performing. We have had to rewrite several for performance reasons. Your example breaks down when you have a table with a lot of rows and takes over an order of magnitude longer to execute. You should not be doing any performance evaluation a such a small table. The CTE is the worst performing of the 3 methods I know of. You do not mention the third which is joining a subquery back to the table.select b.BookTitle, b.BookEdition, b.BookPublishDate, b.BookAuthor 	from	books b				inner join			(select booktitle, max(bookedition) as bookedition from books group by booktitle)q			on b.booktitle = q.booktitle and b.bookedition = q.bookedition;I reinserted your data back into the table 17 times and updated the edition with the identity column:Derived table:Table 'Books'. Scan count 17, logical reads 15461390 MillisecondsCorrelated subquery:Table 'Books'. Scan count 17, logical reads 15461400 MillisecondsCTE:Table 'Books'. Scan count 17, logical reads 212038103 Milliseconds!!!!!!!!!!</description><pubDate>Tue, 15 Jun 2010 10:03:15 GMT</pubDate><dc:creator>rock on dude</dc:creator></item><item><title>RE: Paging and Versioning Using ROW_NUMBER()</title><link>http://www.sqlservercentral.com/Forums/Topic937225-2718-1.aspx</link><description>regarding CTE's; very powerful tool for segmentation, but you do have to be careful as the engine will run the CTE query each time it's referenced. I would think, but haven't explicitly observed, that the db engine would somehow optimize this behavior, since all the reads are happening within the same transaction space it ought to be quite impossible to get a different result when running a CTE for multiple inclusion.I've written some monsterous inline table valued functions that string 5 to 8 CTE's together, recalling the earlier CTE's in later segments; feeding prequalified data to work queries and paging CTEs and so on, only to find the query plan and overhead become monsterous. The showplan doesn't seem to indicate that there are any savings for multiple references to a CTE; question for Microsoft I suppose. I eventually stepped back from the giant CTE approach and started using table valued variables; it's a different type of IO problem, but seemed to be a more effecient solution vs the monster CTE query.When it comes to CTEs and the finding the best balance between convenience and performance you really have to try variations and compare results in the execution plan; as with so many sql server optimization topics; 'it depends'.[hr]now, on to row numbering and paging; wanted to chip in my 2 cents worth.One trick I've used to get a very tight grip on paging is to introduce a second ROW_NUMBER going in the opposite direction as the first, then summing the two to get a row count before page selection; it does introduce some additional overhead and it can be significant, but it the benefit outweighs the cost it can be quite useful. The version below uses a 0 based row and page indexes; first page is 0, first row is 0.Note: removing the RN row number will significantly reduce overhead while continuing to allow you to use the page index functionality; you loose the row count and page count, but can still pull back a specific page in the sequence, accomplishing something like the LIMIT function mySql.[font="Courier New"][code="sql"]DECLARE @PageLimit INT, @PageIndex INT;SELECT @PageLimit=20, @PageIndex=0;SELECT     [RowIndex]=[IX]    ,[RowCount]=[IX] + [RN]     ,[PageCount]=CIELING(1.0 * [IX] + [RN] / @PageLimit)    ,[PageIndex]=FLOOR(1.0 * [IX] / @PageLimit)    ...FROM (        SELECT             [IX]=ROW_NUMBER() OVER(ORDER BY [foo] ASC)-1            ,[RN]=ROW_NUMBER() OVER(ORDER BY [foo] DESC)            ...    ) PageBaseWHERE FLOOR(1.0 * [IX] / @PageLimit) = @PageIndex;[/code][/font]</description><pubDate>Tue, 15 Jun 2010 09:10:32 GMT</pubDate><dc:creator>PakRat</dc:creator></item><item><title>RE: Paging and Versioning Using ROW_NUMBER()</title><link>http://www.sqlservercentral.com/Forums/Topic937225-2718-1.aspx</link><description>Great article.  I've used ROW_NUMBER() in CTE to evaluate (compare) data in the previous rows or next rows.  It was a real life saver. :-)</description><pubDate>Tue, 15 Jun 2010 08:46:20 GMT</pubDate><dc:creator>grc-80104</dc:creator></item><item><title>RE: Paging and Versioning Using ROW_NUMBER()</title><link>http://www.sqlservercentral.com/Forums/Topic937225-2718-1.aspx</link><description>Anye,Thanks for posting a follow up.It sounds like any useful findings you gather could form the basis for an interesting CTE article. ;-)With regards,Lawrence.</description><pubDate>Tue, 15 Jun 2010 08:12:33 GMT</pubDate><dc:creator>Lawrence Moore</dc:creator></item><item><title>RE: Paging and Versioning Using ROW_NUMBER()</title><link>http://www.sqlservercentral.com/Forums/Topic937225-2718-1.aspx</link><description>Ok, I dug it up (or dug up something else that was related) and I apparently misread it -- the article I read says that the CTE is executed the number of times the CTE itself is referenced (i.e. # of joins) x the number of rows from the anchor.http://sqlblogcasts.com/blogs/tonyrogerson/archive/2008/05/17/non-recursive-common-table-expressions-performance-sucks-1-cte-self-join-cte-sub-query-inline-expansion.aspxhttp://sqlblogcasts.com/blogs/tonyrogerson/archive/2008/05/17/non-recursive-common-table-expressions-performance-sucks-2-row-number-is-executed-number-of-cte-references-x-number-of-rows-from-the-anchor.aspxHis examples JOIN to the CTE multiple times and therein lies the performance hit, as opposed to # of columns from the CTE as I had previously believed.However, it still leaves me curious as to why sometimes using a simpler CTE with more joins in the outer query performs better than the other way around.  I will play around with this with statistics on with some of my "hairier" queries and see what it comes back with.</description><pubDate>Tue, 15 Jun 2010 08:07:51 GMT</pubDate><dc:creator>Anye Mercy</dc:creator></item><item><title>RE: Paging and Versioning Using ROW_NUMBER()</title><link>http://www.sqlservercentral.com/Forums/Topic937225-2718-1.aspx</link><description>This article is a good example of how technical information can be written in a clear language.Thanks for sharing this useful technique!!!Roberto R.</description><pubDate>Tue, 15 Jun 2010 08:01:11 GMT</pubDate><dc:creator>rramirez</dc:creator></item><item><title>RE: Paging and Versioning Using ROW_NUMBER()</title><link>http://www.sqlservercentral.com/Forums/Topic937225-2718-1.aspx</link><description>Anye,Many thanks for your post.It's news to me that the performance of a CTE query is based on the number of columns in the "Outer" query.Rather, CTEs are generally very efficient as the processing is done with one pass of the data.Certainly, a very quick investigation using SET STATISTICS IO does not raise any concerns.For example, running:[b]SET STATISTICS IO ON;WITH BookCTE (RowNumber, BookTitle, BookEdition, BookPublishDate, BookAuthor) AS  (        SELECT        ROW_NUMBER()OVER (PARTITION BY BookTitle ORDER BY BookEdition DESC),        BookTitle, BookEdition, BookPublishDate, BookAuthor        FROM dbo.Books  )  SELECT BookTitle, BookEdition, BookPublishDate, BookAuthor  FROM BookCTE  WHERE RowNumber=1  ORDER BY BookTitle[/b]...gives the following output:[b]Table 'Books'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.[/b]Running the same query but only returning a single column (e.g. BookTitle) yields the same IO results.See if you can find the reference that stated this behaviour, or otherwise provide a setup that would show this to be the case. Perhaps it occurs with very large tables where the processing cannot be done fully in memory(?)I'd obviously welcome other experts' views on this point.Best regards,Lawrence.</description><pubDate>Tue, 15 Jun 2010 07:11:16 GMT</pubDate><dc:creator>Lawrence Moore</dc:creator></item><item><title>RE: Paging and Versioning Using ROW_NUMBER()</title><link>http://www.sqlservercentral.com/Forums/Topic937225-2718-1.aspx</link><description>Hi Lawrence, I have spent quite a bit of time recently using Row_Number() with CTEs for paging purposes and have tried to gauge how to get the best performance possible out of them.While researching I read that when you use a CTE in a query, it reruns the CTE for ever column in the outer select that references it:such that in your example query: ;WITH BookCTE (RowNumber, BookTitle, BookEdition, BookPublishDate, BookAuthor) AS (       SELECT       ROW_NUMBER()OVER (PARTITION BY BookAuthor ORDER BY BookPublishDate DESC),       BookTitle, BookEdition, BookPublishDate, BookAuthor       FROM dbo.Books ) SELECT BookTitle, BookEdition, BookPublishDate, BookAuthor FROM BookCTE WHERE RowNumber=1 ORDER BY BookTitleit would run the CTE 4 times -- once for each column in the outer select.  For the example query here that is not a big deal, but if the CTE definition is complex and involves some sort of aggregate function or group by it can be a bit tricky.So I have experimented with writing the query as such and also including only the minimum number of columns in the CTE as required and then having the outer query join to the tables necessary for the select.  Your example can't really be refined much using this method since all of the columns in the select statement are "used" by the CTE... but if we pretended that you also needed the PageCount, PublisherName, and ISBN of the book, then using this method it would read as: ;WITH BookCTE (RowNumber, BookTitle, BookEdition, BookPublishDate, BookAuthor) AS (       SELECT       ROW_NUMBER()OVER (PARTITION BY BookAuthor ORDER BY BookPublishDate DESC),       BookTitle, BookEdition, BookPublishDate, BookAuthor       FROM dbo.Books ) SELECT C.BookTitle, C.BookEdition, C.BookPublishDate, C.BookAuthor, B.PageCount, B.PublisherName, B.ISBN FROM BookCTE C inner join Books B on C.BookTitle=B.BookTitle   --this isn't a very good key, you'd actually want to use the pkey of the table to join WHERE RowNumber=1 ORDER BY BookTitleWhy I bring this up is, first, I wanted to verify the assertion that I read that this is indeed what occurs behind the scenes (since I don't remember where I read it) and also to comment that I have noticed empirically that sometimes the performance is better when I include all the select columns in the CTE and write a simple outer query and sometimes it is better when I do it the way I described above.  In my application, the CTE definition ends up being variable most of the time (based on the input parameters from the user), so I am having to make my best guess as to which way the performance will be better in the majority of cases.Do you (or any of the other SQL gurus reading this) have any thoughts on the topic or best practices on how to write these queries when they get complicated, to keep performance from going way down?Thanks in advance,Anye</description><pubDate>Tue, 15 Jun 2010 06:57:13 GMT</pubDate><dc:creator>Anye Mercy</dc:creator></item><item><title>RE: Paging and Versioning Using ROW_NUMBER()</title><link>http://www.sqlservercentral.com/Forums/Topic937225-2718-1.aspx</link><description>A well-presented and enjoyable article, thanks.</description><pubDate>Tue, 15 Jun 2010 04:53:50 GMT</pubDate><dc:creator>Paul White</dc:creator></item><item><title>RE: Paging and Versioning Using ROW_NUMBER()</title><link>http://www.sqlservercentral.com/Forums/Topic937225-2718-1.aspx</link><description>Martin,Unfortunately I don't know much about MySQL as my background is MS SQL Server.However, you can definitely use stored procedures:http://www.mysqltutorial.org/introduction-to-sql-stored-procedures.aspxAs for Row_Number() type functionality, I don't know if the partitioning capabilities are available, but I found the following link which discusses adding a rownumber field:http://jimlife.wordpress.com/2008/09/09/displaying-row-number-rownum-in-mysql/In conjunction with stored procedures, it should provide the same paging functionality, if this is what you were after.Hope that helps,Regards,Lawrence</description><pubDate>Tue, 15 Jun 2010 02:42:28 GMT</pubDate><dc:creator>Lawrence Moore</dc:creator></item><item><title>RE: Paging and Versioning Using ROW_NUMBER()</title><link>http://www.sqlservercentral.com/Forums/Topic937225-2718-1.aspx</link><description>Dear mr. Moore,Perhaps I have a not so smart question but I wonder if this methodcan be used on mysql databases as well for as far as I know one can't use stored procedures in a mysql database.Hope te hear from you.Kind regards,Martin</description><pubDate>Mon, 14 Jun 2010 23:45:53 GMT</pubDate><dc:creator>info-1049562</dc:creator></item><item><title>Paging and Versioning Using ROW_NUMBER()</title><link>http://www.sqlservercentral.com/Forums/Topic937225-2718-1.aspx</link><description>Comments posted to this topic are about the item [B]&lt;A HREF="/articles/T-SQL/70264/"&gt;Paging and Versioning Using ROW_NUMBER()&lt;/A&gt;[/B]</description><pubDate>Mon, 14 Jun 2010 20:46:43 GMT</pubDate><dc:creator>Lawrence Moore</dc:creator></item></channel></rss>