• peter (1/10/2009)


    A primary key (identity or otherwise) cannot be used due to any sensible order by clause in the base query. Remember here that an order by on just the identity/primary key is meaningless, and not a case that is valid to consider for optimization.

    This is a bit over-generalized statement that the order by identity is meaningless. Just one example: imagine a data-logger application. There is an incoming stream of measurements, and absolutely guaranted is that the later measurement (with later datetime) will be stored under the greater identity. Then a user needs to get a page corresponding to desired date and move left or right on the next pages. The number of pages is enormous and a user don't care about the number, he just wants to move left/right from the chosen timepoint.

    Many apps could be reduced to this model. Even that about contacts with billions of records, but that's another story.

    ...Note that paging over a very large set is something you do not want as the results will be next to meaningless for the end user and end up not being used at all. In such a scenario you want the user to enter some filter restrictions so that you can use those in your base query to get the desired smaller set that can be page over efficiently. Often used filtering is to examine only one year at a time, filter by some category or require a minimum length for a search string (or any combination of those).

    Filtering by, say, category won't break the idea. Still it will be the same initial Id and the page size, just WHERE clause will change. Different order would be an issue, that could be solved by creating sort of "custom index column" in adition to the identity. Maintaining such "custom index" will be a headache, but when the number of requests is very big it could help paging over the huge date.

    Thus beyond paging over a moderately sized set, it is no longer a database design/code issue, but an application design issue instead. You got to combat the problems where they are caused and not try to fix resulting symptoms in the database by making assumption based optimizations.

    Why not "making assumption based optimizations" and why to treat an app designer like an enemy? 🙂