RE: SQL Server 2005 Paging – The Holy Grail

May 7, 2010 at 9:32 am

I haven't implemented this yet, but my thoughts on paging are these:

1) Store each unique combination of request parameters in a table (parent)

2) Run the query, store the results in another table with a foreign key to the parent table. Since this result is now read-only and there will be no provision for updating the rows, update the parent row's resultrowscount for easy access to the total.

3) Also set cache timeout values on the parent. For example, large result sets that took a long time to run could be given long cache timeouts, while smaller results are given shorter ones. Perhaps the data itself determines its life: a query on real-time data is stale in 3 minutes, while a first-quarter report (run in May) has a life of 30 days or more.

4) Pages are retrieved from the prefetched (and denormalized) child table.
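
Steps 1, 2 and 4 might look something like the rough sketch below. I'm using Python with an in-memory SQLite database as a stand-in for SQL Server, so treat it as an illustration rather than a tested implementation. Apart from resultrowscount, every name here (orders, query_cache, query_cache_rows, prefetch, get_page, total_rows) is made up for the example, and the cache timeouts from step 3 are left out to keep it short.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical source data standing in for the real system.
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL);

    -- Step 1: one parent row per unique combination of request parameters.
    CREATE TABLE query_cache (
        cache_id        INTEGER PRIMARY KEY,
        params          TEXT NOT NULL UNIQUE,
        resultrowscount INTEGER NOT NULL DEFAULT 0
    );

    -- Step 2: prefetched, denormalized result rows keyed to the parent.
    CREATE TABLE query_cache_rows (
        cache_id INTEGER NOT NULL REFERENCES query_cache(cache_id),
        row_num  INTEGER NOT NULL,
        customer TEXT,
        amount   REAL,
        PRIMARY KEY (cache_id, row_num)
    );
""")
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("cust%02d" % i, i * 10.0) for i in range(1, 26)],
)


def prefetch(params):
    """Run the source query once and store its rows under a new parent key."""
    key = json.dumps(params, sort_keys=True)  # canonical form of the parameters
    cur = conn.execute("INSERT INTO query_cache (params) VALUES (?)", (key,))
    cache_id = cur.lastrowid
    rows = conn.execute(
        "SELECT customer, amount FROM orders WHERE amount >= ? ORDER BY amount",
        (params["min_amount"],),
    ).fetchall()
    # Number the rows once, at insert time, so paging is a simple range lookup.
    conn.executemany(
        "INSERT INTO query_cache_rows VALUES (?, ?, ?, ?)",
        [(cache_id, n, c, a) for n, (c, a) in enumerate(rows, start=1)],
    )
    # The result is read-only from here on, so the total can be stored once.
    conn.execute(
        "UPDATE query_cache SET resultrowscount = ? WHERE cache_id = ?",
        (len(rows), cache_id),
    )
    return cache_id


def get_page(cache_id, page, page_size):
    """Step 4: read one page from the child table by row number."""
    first = (page - 1) * page_size + 1
    last = page * page_size
    return conn.execute(
        "SELECT row_num, customer, amount FROM query_cache_rows "
        "WHERE cache_id = ? AND row_num BETWEEN ? AND ? ORDER BY row_num",
        (cache_id, first, last),
    ).fetchall()


def total_rows(cache_id):
    """Read the stored total instead of counting the child rows again."""
    return conn.execute(
        "SELECT resultrowscount FROM query_cache WHERE cache_id = ?",
        (cache_id,),
    ).fetchone()[0]
```

Calling prefetch({"min_amount": 60}) once and then get_page(cache_id, 2, 5) reads rows 6 through 10 straight from the child table without touching orders again. Because the row numbers are stored with the rows, each page is a range seek on the (cache_id, row_num) key.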

Further requests for the same query are served from the prefetched result as long as it is within the staleness tolerance. If the data is stale, the old key is marked obsolete and the process starts over. (Marking rather than deleting old results allows a nice audit/analysis to be done on cache hits and results.)
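
The hit/stale/obsolete logic could be sketched roughly as follows. Again this is Python with SQLite standing in for SQL Server, and the names (query_cache, created_at, ttl_seconds, is_obsolete, find_or_refresh, run_query) are my own assumptions for illustration. The current time is passed in as a plain number of seconds so the behaviour is easy to follow.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE query_cache (
        cache_id    INTEGER PRIMARY KEY,
        params      TEXT NOT NULL,
        created_at  REAL NOT NULL,              -- seconds, supplied by the caller
        ttl_seconds REAL NOT NULL,              -- staleness tolerance
        is_obsolete INTEGER NOT NULL DEFAULT 0  -- 1 = retired, kept for audit
    )
""")


def find_or_refresh(params, now, ttl_seconds, run_query):
    """Return (cache_id, was_hit) for a parameter set at time `now`.

    `run_query` is a hypothetical callback that repopulates the child
    table for a newly created cache_id.
    """
    row = conn.execute(
        "SELECT cache_id, created_at, ttl_seconds FROM query_cache "
        "WHERE params = ? AND is_obsolete = 0",
        (params,),
    ).fetchone()
    if row is not None:
        cache_id, created_at, ttl = row
        if now - created_at <= ttl:
            return cache_id, True  # fresh enough: serve the prefetched result
        # Stale: mark the old key obsolete instead of deleting it.
        conn.execute(
            "UPDATE query_cache SET is_obsolete = 1 WHERE cache_id = ?",
            (cache_id,),
        )
    # No usable entry: start the process over with a new parent row.
    cur = conn.execute(
        "INSERT INTO query_cache (params, created_at, ttl_seconds) "
        "VALUES (?, ?, ?)",
        (params, now, ttl_seconds),
    )
    run_query(cur.lastrowid)
    return cur.lastrowid, False
```

With a 180-second tolerance, a request at time 0 creates the entry, a repeat at time 100 is a cache hit, and a repeat at time 200 retires the old key and creates a new one. The obsolete rows stay in the table, which is what makes the hit/miss analysis possible later.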

If business rules allow for Top N to prevent absurdly large results, then go for it. Otherwise, the cache at least mitigates the need to redundantly query the results from the source system.

I think a better question is whether users really need pages of data more than they need better results in general. "Eyeballing" long lists is a huge waste of time. Thanks to Google (et al.) we expect relevant search results to be on the first page. If we're not sure what we're looking for, we might desperately look at page 2 or 3. After that, we hopefully have an idea how to search more specifically for better results. This makes search an iterative process of refining criteria rather than a brute-force perusal of pages of results. I was going to make this concept the focus of my seminar paper this semester, but the day the topic proposal was due I read the SQLServerCentral article on genetic algorithms in T-SQL, so I did that instead.

I urge anyone thinking of implementing results paging to consider how/why the user can be better served with a more conversational and iterative process. Our job should be to fetch what they really want from our database (even when they aren't sure what they want). Simply giving the user multiple pages of what they asked for makes us look lazy and makes them work much harder than they should.