Server Side Paging With SQL Server 2005

  • Comments posted here are about the content posted at http://www.sqlservercentral.com/columnists/jSebastian/3181.asp

    .

  • Hi all,

    again, the source code listing does not show up well in IE. The lines are wrapped at places where they are not supposed to be.

    The code shows up correctly in Mozilla.

    If you are using IE, i would suggst to click on the link 'code' and read the source code from there.

     

    .

  • Jacob,

    I've read seemingly a bazillion SQL Server pagination articles/tutorials, and I must say, I think that yours is head and shoulders above the rest. Very clearly worded, and more specifically, extremely well commented code. Thank you!

  • Good article, although every server based pagination sample I read seems to skip the fact that in almost every situation the "user" wants to know how many records there are in total, including any specified filters. This allows any client side paging mechanism to say "you're on page 3 of 15" etc.

    Is there any way to do this without using a temp table/variable as this seems to be a much less elegant solution than the ROW_NUMBER paging shown here ?

  • Hi noggin,

    I had found 3 points that I must add to the above article after playing with a few real-life assignments.

    1. Find the total number of records affected by the current filters
    2. Support for sorting ascending and descending (usually, as a result of clicking on the header of a grid/table at the front end)
    3. Some ASP.NET code which shows how this code would work with an ASP.NET grid control (includes how to display custom page numbers at the front end  etc..)

    I have a follow-up article coming up, which shows some sample code which may be helpful to clear those 3 points.

    thanks

    Jacob

    .

  • Jacob,

    I read your article what I found very useful.

    Now I have a question related to joins. Please consider this two examples:

    WITH EmployeesOrders AS (

     SELECT 

       Employees.EmployeeID,

       Employees.FirstName,

       Employees.LastName,

       Orders.OrderDate,

       Orders.ShippedDate,

       ROW_NUMBER()OVER (ORDER BY LastName) AS RecNum

     FROM Orders INNER JOIN Employees ON Orders.EmployeeID = Employees.EmployeeID)

    SELECT * FROM EmployeesOrders

    WHERE RecNum BETWEEN ((@PageNumber - 1) * @PageSize) + 1 AND @PageNumber * @PageSize

    ORDER BY RecNum

    and

    WITH EmployeesOrders AS (

     SELECT 

       Employees.EmployeeID,

       ROW_NUMBER()OVER (ORDER BY LastName) AS RecNum

     FROM Employees)

    SELECT

      Employees.EmployeeID,

      Employees.FirstName,

      Employees.LastName,

      Orders.OrderDate,

      Orders.ShippedDate

    FROM EmployeesOrders

      INNER JOIN Employees ON EmployeesOrders.EmployeeID=Employees.EmployeeID

      INNER JOIN Orders ON Orders.EmployeeID = Employees.EmployeeID

    WHERE RecNum BETWEEN ((@PageNumber - 1) * @PageSize) + 1 AND @PageNumber * @PageSize

    ORDER BY RecNum

    In the first query all the join and select stuff is done in the CTE (joining maybe 100000 records from a foreign table)

    but in the second query the CTE returns only the ID needed to perform the selecting and the joining.

    In this case the joins will have to deal only with the number of records given from the @Pagesize (maybe 15 or 20) and I think that this would be much more efficient in a 100000 records scenario.

    Having to implement this in my current project which option should I choose ?

    Many thanks

    Franco Pizzinini

  • I found this a really useful article since I hadn't played much with CTEs or ROW_NUMBER() before. Both of those concepts have suddenly 'clicked' and I really see the benefits of your approach. Just a few comments though, and apologies for being a bit picky:

    Firstly, when you're applying the LastName, Title and City filters your LIKE clause starts with a wildcard - that's probably fine for small tables, but in large tables that's going to cause a table scan which will kill performance. If that's the application requirement then so be it, but for an example like yours I think you'd be better off removing the leading % so as not to encourage this practice. The whole idea of paging through result sets is to enable efficient IO and network bandwidth usage, particularly over large tables.

    Secondly, several paging solutions I have seen do not scale at all well as the number or records in a table increases. Do you have any idea how well SQL Server optimizes the ROW_NUMBER function and how well your solution scales over big tables? I'll do a little investigation of my own when I get a chance, I've got some tables with several million records in which I'd be interested in doing some tests on.

  • Hi

    I quite like the exposition of the rownumber and over stuff.  Interesting.

    What I would ask is whether the performance issues relating to the and(x is null or x = y) that were experienced in SQL2000 (ie knocking it down to a scan) have been resolved on 2005?

     

    Cheers

  • Cheers Jacob, I look forward to the follow-up.

  • I think that you can limit the initial rows you get in the CTE if you order the query and use the TOP clause

    ;WITH emp AS (SELECT TOP (@PageNumber * @PageSize)

    /* retrieve only a limited result set

    if you need records 31 -> 60, you retrieve in the first step only 60 rows

    */

    CASE

    WHEN @SortOrder = 'Title' THEN ROW_NUMBER()OVER (ORDER BY Title)

    WHEN @SortOrder = 'HireDate' THEN ROW_NUMBER()OVER (ORDER BY HireDate)

    WHEN @SortOrder = 'City' THEN ROW_NUMBER()OVER (ORDER BY City)

    -- In all other cases, assume that @SortOrder = 'LastName'

    ELSE ROW_NUMBER()OVER (ORDER BY LastName)

    END AS RecID,

    LastName,

    FirstName,

    Title,

    HireDate,

    City,

    Country,

    PostalCode

    FROM employees

    WHERE

    (@LastName IS NULL OR LastName LIKE '%' + @LastName + '%')

    AND

    (@Title IS NULL OR Title LIKE '%' + @Title + '%')

    AND

    (@City IS NULL OR City LIKE '%' + @City + '%')

    /*specify order */

    ORDER BY

    CASE

    WHEN @SortOrder = 'Title' THEN Title

    WHEN @SortOrder = 'HireDate' THEN HireDate

    WHEN @SortOrder = 'City' THEN City

    ELSE LastName

    END

    )

    SELECT

    RecID, -- This column is only for debugging/testing purpose.

    LastName,

    Title,

    HireDate,

    City

    FROM emp

    /*

    you need only the last @PageSize rows from your result set

    */

    WHERE RecID > (@PageNumber - 1) * @PageSize)

    ORDER BY RecID

  • The leading wild card could be expensive as noted above. I'd also worry about users actually paging through the data causing a possibly expensive query to be run over and over. I've seen solutions that optimized for the first page only as it's obviously always used, and solutions that pushed primary keys, rownumber, and session id into a permanent table to make second and later pages faster to retrieve. All tradeoffs of course.

  • I thought this was an excellent article with a simple solution to a complex problem.  There are definitely possibilities for performance imporvement, but the solution supplied to the problem was easy to understand and I could implement it immediately if needed.  Many times articles are not as clear and concise as this one.

    As far as performance enhancements, well, if there weren't any suggestions made by folks visiting this site, the site wouldn't be as good a resource as it is.

  • Franco,

    I was just looking at a similar trade-off. I found that for my needs, the answer depends (big surprise ).

    It depends on what columns you are using to execute your row_number() over and what indexes you have.

    If you are doing row_number() over the column(s) that your table is clustered on, then you should retrieve all of the necessary columns from that table within the definition of the CTE. Otherwise you are grabbing the IDs in one pass and then going back to get the associated columns from those rows. At least this is the plan I saw on my machine.

    However if you have a covering nonclustered index for row_number() and you're going to have to do bookmark lookups anyway, I believe the best approach is to just return the columns needed for the lookup from within the CTE and then join against it from outside the CTE.

    It doesn't hurt to double check the plan on your machine, but that is the conclusion I came to.

    Hope that helps!

    -Adrian

  • I know it is just a matter of personal preference, but instead of parameters like @PageSize and @PageNumber I use @FirstItem and @NumItems. All of the array operations in c# use a beginIndex/length style so I prefer to stay similar to that in my data access layer, and leave the notion of "Pages" to the presentation layer.

  • Nice article.  I particularly liked the showcasing of the usefullness of the new ROW_NUMBER() function in SQL Server 2005.  I typically stick to developing on the SQL Server side and don't do much on the web side, but I have seen some really narly solutions for accomplishing the sort order with pagination on SQL Server 2000 which involved a ton of dynamic SQL.  It would be really interesting to see what it would take to duplicate this elegant stored procedure without the use of the ROW_NUMBER() over (order by x) function. 

    Anyway, thanks for the article.

     

    Dan

Viewing 15 posts - 1 through 15 (of 61 total)

You must be logged in to reply to this topic. Login to reply