SQL Server 2005 Paging – The Holy Grail

  • C. Westra (3/12/2009)


    I'm afraid differences in resources are partly induced by Sql Server caching (intermediate) results between the first and latter queries.

    If you use DBCC DROPCLEANBUFFERS between the queries, caches are emptied and differences in time and physical reads may better reflect actual use of resources.

    Heh... or do what I did... just run the code more than once so you can see that even in the face of caching, some solutions just shouldn't be used. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (3/12/2009)


    C. Westra (3/12/2009)


    I'm afraid differences in resources are partly induced by Sql Server caching (intermediate) results between the first and latter queries.

    If you use DBCC DROPCLEANBUFFERS between the queries, caches are emptied and differences in time and physical reads may better reflect actual use of resources.

    Heh... or do what I did... just run the code more than once so you can see that even in the face of caching, some solutions just shouldn't be used. 😉

    ... and, just to drive the point home, here's the code with both cache clearing commands in it just before each test run AND I put the "Holy Grail" code as the last code to give it the absolute best chance AND I listed the results from the 1st run after I rebuild the table. Caching has nothing to do with how bad the "Holy Grail" method takes a beating...

    --===== Define the starting row and page size

    DECLARE @StartRow INT ; SET @StartRow = 900000

    DECLARE @PageSize INT ; SET @PageSize = 50

    PRINT '--============================================================================='

    PRINT '-- The "No RBAR/No Join" method'

    PRINT '--============================================================================='

    --===== Clear the guns

    DBCC FREEPROCCACHE

    DBCC DROPCLEANBUFFERS

    --===== Turn on the timers

    SET STATISTICS IO ON

    SET STATISTICS TIME ON

    --===== The "No RBAR/No Join" method

    ;WITH

    cteCols AS

    (

    SELECT NULL AS SomeInt, NULL AS SomeLetters2, 0 AS Seq, Rows AS TotRows

    FROM sys.Partitions

    WHERE Object_ID = OBJECT_ID('dbo.JBMTest')

    AND Index_ID = 1

    UNION ALL --------------------------------------------------------------------

    SELECT SomeInt, SomeLetters2,

    ROW_NUMBER() OVER(ORDER BY SomeInt, SomeLetters2) AS Seq,

    NULL AS TotRows

    FROM dbo.JBMTest

    )

    SELECT Seq, SomeInt, SomeLetters2, TotRows

    FROM cteCols

    WHERE Seq BETWEEN @StartRow AND @StartRow + @PageSize - 1

    OR Seq = 0

    ORDER BY Seq

    --===== Turn off the timers

    SET STATISTICS TIME OFF

    SET STATISTICS IO OFF

    PRINT '--============================================================================='

    PRINT '-- A different No Join method'

    PRINT '--============================================================================='

    --===== Clear the guns

    DBCC FREEPROCCACHE

    DBCC DROPCLEANBUFFERS

    --===== Turn on the timers

    SET STATISTICS IO ON

    SET STATISTICS TIME ON

    --===== A different No Join method

    ;WITH

    cteCols AS

    (

    SELECT SomeInt, SomeLetters2,

    ROW_NUMBER() OVER(ORDER BY SomeInt, SomeLetters2) AS Seq,

    NULL AS TotRows

    FROM dbo.JBMTest

    )

    SELECT Seq, SomeInt, SomeLetters2, (SELECT Rows

    FROM sys.Partitions

    WHERE Object_ID = OBJECT_ID('dbo.JBMTest')

    AND Index_ID = 1) AS TotRows

    FROM cteCols

    WHERE Seq BETWEEN @StartRow AND @StartRow + @PageSize - 1

    OR Seq = 0

    ORDER BY Seq

    --===== Turn off the timers

    SET STATISTICS TIME OFF

    SET STATISTICS IO OFF

    PRINT '--============================================================================='

    PRINT '-- Peso''s Embedded "2 Bite" method'

    PRINT '--============================================================================='

    --===== Clear the guns

    DBCC FREEPROCCACHE

    DBCC DROPCLEANBUFFERS

    --===== Turn on the timers

    SET STATISTICS IO ON

    SET STATISTICS TIME ON

    --===== Embedded "2 Bite" method

    ;WITH

    cteCols AS

    (

    SELECT SomeInt, SomeLetters2,

    ROW_NUMBER() OVER(ORDER BY SomeInt, SomeLetters2) AS Seq,

    NULL AS TotRows

    FROM dbo.JBMTest

    )

    SELECT Seq, SomeInt, SomeLetters2, (SELECT COUNT(*) FROM dbo.JBMTest) AS TotRows

    FROM cteCols

    WHERE Seq BETWEEN @StartRow AND @StartRow + @PageSize - 1

    OR Seq = 0

    ORDER BY Seq

    --===== Turn off the timers

    SET STATISTICS TIME OFF

    SET STATISTICS IO OFF

    PRINT '--============================================================================='

    PRINT '-- The "Holy Grail" method'

    PRINT '--============================================================================='

    --===== Clear the guns

    DBCC FREEPROCCACHE

    DBCC DROPCLEANBUFFERS

    --===== Turn on the timers

    SET STATISTICS IO ON

    SET STATISTICS TIME ON

    --===== The "Holy Grail" method of getting a page of info

    ;WITH

    cteCols AS

    (

    SELECT SomeInt, SomeLetters2,

    ROW_NUMBER() OVER(ORDER BY SomeInt, SomeLetters2) AS Seq,

    ROW_NUMBER() OVER(ORDER BY SomeInt DESC, SomeLetters2 DESC) AS TotRows

    FROM dbo.JBMTest

    )

    SELECT Seq, SomeInt, SomeLetters2, TotRows + Seq - 1 AS TotRows

    FROM cteCols

    WHERE Seq BETWEEN @StartRow AND @StartRow + @PageSize - 1

    ORDER BY Seq

    --===== Turn off the timers

    SET STATISTICS TIME OFF

    SET STATISTICS IO OFF

    Here're the results...

    --=============================================================================

    -- The "No RBAR/No Join" method

    --=============================================================================

    DBCC execution completed. If DBCC printed error messages, contact your system administrator.

    DBCC execution completed. If DBCC printed error messages, contact your system administrator.

    (51 row(s) affected)

    Table 'JBMTest'. Scan count 1, logical reads 1985, physical reads 0, read-ahead reads 1, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'sysrowsets'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 1391 ms, elapsed time = 1653 ms.

    --=============================================================================

    -- A different No Join method

    --=============================================================================

    DBCC execution completed. If DBCC printed error messages, contact your system administrator.

    DBCC execution completed. If DBCC printed error messages, contact your system administrator.

    (50 row(s) affected)

    Table 'sysrowsets'. Scan count 50, logical reads 100, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'JBMTest'. Scan count 1, logical reads 1985, physical reads 0, read-ahead reads 1, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 1203 ms, elapsed time = 1371 ms.

    --=============================================================================

    -- Peso's Embedded "2 Bite" method

    --=============================================================================

    DBCC execution completed. If DBCC printed error messages, contact your system administrator.

    DBCC execution completed. If DBCC printed error messages, contact your system administrator.

    (50 row(s) affected)

    Table 'JBMTest'. Scan count 2, logical reads 3970, physical reads 0, read-ahead reads 1, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 1453 ms, elapsed time = 2253 ms.

    --=============================================================================

    -- The "Holy Grail" method

    --=============================================================================

    DBCC execution completed. If DBCC printed error messages, contact your system administrator.

    DBCC execution completed. If DBCC printed error messages, contact your system administrator.

    (50 row(s) affected)

    Table 'JBMTest'. Scan count 1, logical reads 1985, physical reads 0, read-ahead reads 1, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 7859 ms, elapsed time = 12073 ms.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thank you, everyone, for the eye-opening research. You guys have contributed some caveats to consider when trying to identify the best solution.

    Steve's lead-in was a bit misleading in that I do not consider the 'holy grail' method to be "the best" in all cases (although it remains the best in cases that I've had to work with). I was defining the 'holy grail' as being able to get the count and the page without adding a bunch of I/O overhead.

    Every time I've needed to implement server-side paging, the code would be paging the results of a query, sometimes simple, sometimes complex. As a result Jeff's 'no join' methods are out the window, which is a shame because that's a very cool trick! In the edge cases, these queries would page tens of thousands of records (nowhere near the 1-2 million record test cases we've seen on this thread, although I should have done tests like that for this article)

    I started looking for a better approach to getting a count because the underlying query that had to be returned to the user was very complex and expensive. Doing the 'embedded two-bite' method or others add considerable cost and time to execute the query.

    For my ends, the 'holy grail' approach is absolutely the fastest and most efficient technique and I am convinced that there are other people out there who can benefit from this technique (which is why I wrote the article) Will this approach be the best in all cases? No

    There is definitely a lot of good feedback and information in this thread. I'll update the article with the information you shared.

    SQL guy and Houston Magician

  • For my ends, the 'holy grail' approach is absolutely the fastest and most efficient technique

    Heh... sure... 7 to 8 seconds compared to just over a second. I'm sure your customers will think the same thing while they're waiting. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (3/13/2009)


    Heh... sure... 7 to 8 seconds compared to just over a second. I'm sure your customers will think the same thing while they're waiting. 😉

    LOL Thankfully the queries being paged aren't quite as large as those you used to give my technique a beating. 😉

    This is the right tool for me only because the paged result sets are fairly small, and the cost of the query that builds the resultset is what causes my customers to wait.

    Your comments caused me to take another look at my procs, and the worst case scenario is elapsed time overhead in the tens of milliseconds. I, and my customers, can live with that.

    I've submitted a minor revision to the article, suggesting that this approach is only beneficial in a narrow set of circumstances.

    SQL guy and Houston Magician

  • Just for the benefit of the folks talking about grabbing the count of all rows in the table separately, note that in a lot of cases we can't look at sys.indexes or the more reliable sys.dm_db_partition_stats because our queries have WHERE clauses or are JOINed to other tables.

  • Robert Cary (3/14/2009)


    Jeff Moden (3/13/2009)


    Heh... sure... 7 to 8 seconds compared to just over a second. I'm sure your customers will think the same thing while they're waiting. 😉

    LOL Thankfully the queries being paged aren't quite as large as those you used to give my technique a beating. 😉

    This is the right tool for me only because the paged result sets are fairly small, and the cost of the query that builds the resultset is what causes my customers to wait.

    Your comments caused me to take another look at my procs, and the worst case scenario is elapsed time overhead in the tens of milliseconds. I, and my customers, can live with that.

    I've submitted a minor revision to the article, suggesting that this approach is only beneficial in a narrow set of circumstances.

    Heh... understood... but the problem is that other folks, who may not have such limited numbers of rows, are reading your good article and I want them to know that your method is rather slow in the face of scalability.

    To prevent problems like this, I will always write code against a million row test table. It's a good habit to get into so that you're not called up at midnight on the last day of each month by your rather ticked of boss wondering why your code has, yet again, caused him to miss the SLA deadlines for month end runs.

    Never justify performance challenged code with low row counts. It's an accident waiting to happen.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • The original article was very well written and it's sooo nice to statistics to back up the arguments presented.

    However, having read the forums here I have to agree with Peso that overall time is perhaps more important than just I/O reads, certainly in terms of customer experience anyway!

    I've tried Peso's method on several tables of real data with rows ranging from a few hundred to over 500,000 and in every single case Peso's method is quicker by a factor of at least 10.

    Thanks for the post, it has made a massive difference to my paging search times.

  • Thank you for the feedback!


    N 56°04'39.16"
    E 12°55'05.25"

  • Hello all,

    After benefitting from the code examples and explanations above, I thought I'd make a contribution.

    Below is a generic stored procedure that will retrieve the page of rows based on the parameters fed to it. It determines whether the table called for is in fact a table or a view. If it's a table, it uses the sys-tables to get the total row count (thanks to a post above this one!) and if it's a view, it uses the original strategy of calculating the inverse row # for a given row. I didn't compute the total rows because I thought whatever was calling it would only have to make the computation once rather than once-per-row if I had done it here.

    Before the Efficiency Police hall me away, this was slapped together as a proof of concept and is not (in any way) presented as the best way of doing this. And there is a rule. Your ORDER-BY clause must include ASC where appropriate. This made inverting the order-by merely ugly, down from absolutely hideous.

    [Code]

    set ANSI_NULLS ON

    set QUOTED_IDENTIFIER ON

    go

    ALTER PROCEDURE [dbo].[GetPage]

    -- Add the parameters for the stored procedure here

    (

    @TableName varchar(1000)

    ,@Fields varchar(1000)

    ,@OrderBy varchar(1000)

    ,@StartPosition int

    ,@PageLen int

    )

    AS

    Begin

    DECLARE

    @strSQL varchar(max)

    ,@RowCount int

    ,@RightNow DATETIME

    ,@TableType varchar(20)

    ,@RevOrderBy varchar(1000)

    Set @RightNow=GETDATE()

    Set @PageLen = @PageLen - 1

    if not EXISTS (select * from INFORMATION_SCHEMA.tables where table_name = @TableName)

    begin

    Select 'Table/View not found' as Error

    return (-1)

    end

    select @TableType = TABLE_TYPE from INFORMATION_SCHEMA.tables where table_name = @TableName

    if @TableType = 'BASE TABLE'

    begin

    SELECT @RowCount=P.Rows

    FROM sys.partitions P

    INNER JOIN sys.indexes I ON (P.object_id = I.object_id) AND (P.index_id = I.index_id)

    INNER JOIN sys.objects O ON (P.object_id = O.object_id)

    WHERE (I.type IN(0,1)) AND (O.name = PARSENAME(@TableName, 1) AND

    O.[schema_id] = IsNull(SCHEMA_ID(PARSENAME(@TableName, 2)), O.[schema_id]))

    ORDER BY O.name

    set @strSQL =' SELECT * ' +

    ', ' + convert(varchar(10),@RowCount) + ' as [RowCount] ' +

    ' FROM (SELECT ' + @Fields + ', ROW_NUMBER() OVER(ORDER BY ' + @OrderBy + ') AS RowNumber ' +

    ' FROM [' + @TableName + ']) as [' + @TableName + '_A] ' +

    ' WHERE RowNumber BETWEEN ' + convert(varchar(10),@StartPosition) +

    ' AND ' + convert(varchar(10),(@StartPosition + @PageLen)) + ' ORDER BY RowNumber'

    end

    else

    begin

    set @RevOrderBy = upper(@OrderBy) + ','

    set @RevOrderBy = replace(replace(replace(@RevOrderBy ,' ASC,',' ~A,'),' DESC,',' ASC,'),' ~A',' DESC')

    set @RevOrderBy = left(@RevOrderBy ,len(@RevOrderBy)-1)

    set @strSQL =' SELECT *' +

    ' FROM (' +

    'SELECT ' + @Fields + ', ROW_NUMBER() OVER(ORDER BY ' + @OrderBy + ') AS RowNumber, ' +

    ' ROW_NUMBER() OVER(ORDER BY ' + @RevOrderBy + ') AS RowNumberInverse ' +

    ' FROM [' + @TableName + ']) as [' + @TableName + '_A] ' +

    ' WHERE RowNumber BETWEEN ' + convert(varchar(10),@StartPosition) +

    ' AND ' + convert(varchar(10),(@StartPosition + @PageLen)) + ' ORDER BY RowNumber'

    end

    --SELECT SQL=@strSQL

    EXEC(@strSQL)

    End

    [/code]

    Enjoy!

  • werner.broser said

    Nice method, but works only for unique fields!

    -- I agree.

    When the order-expression does not bear an unique value, "The Holy Grail" can make mistakes:

    DECLARE @t1 TABLE ([x] INT, [y] INT)

    INSERT INTO @t1 ([x], [y])

    SELECT 1, 1

    UNION ALL

    SELECT 1, 2

    SELECT *

    FROM (SELECT *, ROW_NUMBER() OVER (ORDER BY [x] ASC)

    + ROW_NUMBER() OVER (ORDER BY [x] DESC)

    - 1 AS [Sum]

    FROM @t1) t

    The query result gives

    x y Sum

    1 1 1

    1 2 3

    which obviously cannot satisfy us.

    For rows over which order-expression gives the same value, their relative order will not change regardless of "ASC" or "DESC" option, but only following the order how they are physiclely stored. The previous example proves this.

    We should know this limit and do not apply "The Holy Grail" in the wrong place.

  • Thank you for this. I've used it now, but haven't studied reads and performance gains (or losses) yet. Realized though that the column(s) used for sorting has to be unique in order for the total row count to be accurate. If there are several equal records in the sorted column, the total row count will not be accurate. You have to make sure that the sorting is 100% opposite to the initial sorting. Just a simple addition to the solution after trying it out. Thank you again!

    / Peter.

  • Sorry for stating an already noticed "feature". Only read the earliest posts...

  • Really nice solution for the total row count and paging problem; and you can't miss the analogy to the classical Gauss argument for the sum of the N first integers 1+2+..+N= (N+1)*N / 2

    1 + 2 + ... N

    N + (N-1) ... 1

    🙂

  • Hi Friend,

    Can you get me the other alternative not using trigger

Viewing 15 posts - 31 through 45 (of 64 total)

You must be logged in to reply to this topic. Login to reply