Using OFFSET for Paging

  • Comments posted to this topic are about the item Using OFFSET for Paging

  • We will soon be able to drop support for SQL 2008R2 from our product making SQL 2014 the minimum and thus we could use this newer paging syntax. However currently in our web pages above the results we display:  1 to 10 of 2310, for example and a up to 11 paging links, like, for example:  prev 1 2 3 4 5 6 7 8 9 10 11 next, which ever page you are on we try to main 11 links with 5 either side of the current page.

    Because of the above output we need to know the total number of records in order to render the paging info and links.

    Is there a way to get the total number of rows at the same time as just getting 1 page full of rows?

  • I have used the following pattern for similar requirements

    CREATE PROCEDURE Foo
    @FilterCondition ....
    , @PageNum int = 1
    , @PageSize int = 100
    , @TotalPages int = null output
    , @TotalRecords int = null output
    AS
    BEGIN
    SET NOCOUNT ON;

    SELECT ...
    FROM ....
    WHERE ... @FilterCondition
    ORDER BY ...
    OFFSET (@PageNum-1) *@PageSize ROWS -- No of Rows to Skip
    FETCH NEXT @PageSize ROWS ONLY; -- No of Rows to Return

    -- IF You need to know the size of the data ....
    SELECT @TotalPages = (COUNT(*) -1) /@PageSize +1
    , @TotalRecords = COUNT(*)
    /*** EXACTLY THE SAME FROM AND WHERE AS THE ABOVE QUERY ***/
    FROM ....
    WHERE ... @FilterCondition

    END
    GO

    • This reply was modified 4 years, 6 months ago by  DesNorton.
  • From the performance article this one links to it looks like the answer is no - it suggests that doing a count() in a separate query is the best option to go for.

  • Oh, what a recipe for disaster!  Since you are requerying the data every time, you have NO WAY to know if the offset is still valid.  You are going to the 10th row, for instance, but what if another row has been inserted between row 1 and 10, or deleted?  At that point, your data integrity is completely gone.  Please, never do this for data unless the data set is guaranteed to be completely static.

     

  • rick-1071254 wrote:

    Oh, what a recipe for disaster!  Since you are requerying the data every time, you have NO WAY to know if the offset is still valid.  You are going to the 10th row, for instance, but what if another row has been inserted between row 1 and 10, or deleted?  At that point, your data integrity is completely gone.  Please, never do this for data unless the data set is guaranteed to be completely static.

    Surely, as in most things SQL, it depends?

    If it were some public website and someone was doing a search for something the fact that 1 person may get a slightly different result 1 time due to an insert vs. the next time they search is a price worth paying for improved performance of the search query.

    • This reply was modified 4 years, 6 months ago by  peter.row.
  • Well, you could say that about life, in general.  But I the trade-off here is data integrity versus performance.  Wrong data versus fast data.  It is not that close for me.  Further. the fact that this article does not mention the trade-off or even acknowledge it exists is a real problem.

    I would agree with you that a fully informed choice could be appropriate.  But someone reading this article alone won't get that opportunity.

    • This reply was modified 4 years, 6 months ago by  rick-1071254.
  • peter.row wrote:

    We will soon be able to drop support for SQL 2008R2 from our product making SQL 2014 the minimum...

    I strongly recommend that you upgrade that to the latest and greatest and simply set the compatibility level to whatever still works.  You may be surprised to find that the latest and greatest actually does work (or not... just change the setting if it doesn't).

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden wrote:

    peter.row wrote:

    We will soon be able to drop support for SQL 2008R2 from our product making SQL 2014 the minimum...

    I strongly recommend that you upgrade that to the latest and greatest and simply set the compatibility level to whatever still works.  You may be surprised to find that the latest and greatest actually does work (or not... just change the setting if it doesn't).

    Unfortunately that is not currently possible since MS completely screwed us with SQL Server Reporting Services which we use for reporting. You can now only have 1 instance of SSRS per server and on top of that MS changed the way in which the custom security works. We have an instance of SSRS per customer and we have to implement custom security because it's a non-domain web product. We have done some investigation on the security side of the changes but the SSRS is a deal breaker because we'd have to come up with a completely different security mechanism to way of integrating it to cope with that, let alone the migration issues.

  • rick-1071254 wrote:

    Well, you could say that about life, in general.  But I the trade-off here is data integrity versus performance.  Wrong data versus fast data.  It is not that close for me.  Further. the fact that this article does not mention the trade-off or even acknowledge it exists is a real problem.

    I would agree with you that a fully informed choice could be appropriate.  But someone reading this article alone.  won't get that opportunity.

    My harsh opinion on the subject would be that if you read an article like this without considering the effects then that is on you not the article author.

    And by "you" I mean anybody reading the article, not you personally responding to my comment.

    • This reply was modified 4 years, 6 months ago by  peter.row.
  • My harsh opinion on the subject would be that if you read an article like this without considering the effects then that is on you not the article author.

    I do not disagree.  But many people will not consider the effects on data integrity - especially since it is somewhat orthogonal to the performance topic discussed.  Hopefully they will at least read the comments.

    And, just for clarification, inserting a new row is not the only thing that will damage data integrity of the result set.  If there are any filters involved, an earlier row may no longer be included in the result set.  And, of course, deleted rows will also throw off the count.

    As far as communication responsibilities go, it's a one-to-many, with the author being the one and the readers the many.  It is both more prudent and efficient to state things properly in the initial article.

  • rick-1071254 wrote:

    Oh, what a recipe for disaster!  Since you are requerying the data every time, you have NO WAY to know if the offset is still valid.  You are going to the 10th row, for instance, but what if another row has been inserted between row 1 and 10, or deleted?  At that point, your data integrity is completely gone.  Please, never do this for data unless the data set is guaranteed to be completely static.

    Recipe for disaster? A better recipe would be trying to retrieve all 10 million records to your client in a "static" result set, which won't be "static" anyway because behind the scenes, in the database, the set is changing...so if I want to see what's new/changed, I have to retrieve, again, all 10 million+ records, every time I want to see what changed? No, I don't think so.

    That said I do somewhat agree with you...but the problem in my opinion (and, I'm guessing, in yours) is paging that is used without consideration for concurrency. That certainly can be a recipe for disaster, or at least for some serious confusion. The two really need to go together, and when they do it can work reasonably well.

  • I know this sounds obvious, so apologies if it is for whomever reads this.  I just wanted to say it out loud if it's not obvious...

    This reminds me of a common request to get an "up to date row count".  It's not possible to have an 100% up to date row count on a non-static table.

    The same holds true with paging on non-static tables.  Either the GUI user needs to be informed that data can change as modifications occur while they're paging or the table needs to be temporal so that the start time of the current session (or whatever) can be used as the PIT (Point-in-Time) that the paging is returning.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • It's all a matter of how likely a possibility.

    Recipe for disaster? A better recipe would be trying to retrieve all 10 million records to your client in a "static" result set, which won't be "static" anyway because behind the scenes, in the database, the set is changing...so if I want to see what's new/changed, I have to retrieve, again, all 10 million+ records, every time I want to see what changed? No, I don't think so.

    You can stop right after "retrieve all 10 million records to your client".  You need go no farther than this to call out a truly horrid application and database design.  Databases are designed to allow you to limit your results to meaningful data.  This is an ETL job, not an interactive query.  There are lots of ways to do that.

    Databases (at least mature relational databases) are also designed to give you a consistent view of data, regardless of other simultaneous write activity.  So changes in the database once a transactionally consistent read started would not show up, unless you forced it.

    You can also design your data to address a use case.  If you really want to see what data has been changed in a certain amount of time frequently, you can simply add a timestamp column or use a time created on an artificial key.  I would refer to the harsh opinion of another poster in this regard - if you "design" your data structures with no thought as to use cases, you deserve what you get.

    And please do not take this as a personal attack.  I've been toiling in these DBMS fields for a long time, and, to my continual regret, data integrity is frequently not even considered.  As I mentioned earlier, if someone consciously makes the decision to accept data which may not be correct, more power to them.  If they make a design decision without understanding this, woe is them.

  • Jeff -

    There are other options out there.  Databases that use multi-version read consistency do exactly what you are asking - present an accurate picture of the state of data at a particular point in time, be it a simple count or a complete result set, even through paging.  Your DBMS of choice may not offer this, but others do.

     

Viewing 15 posts - 1 through 15 (of 43 total)

You must be logged in to reply to this topic. Login to reply