Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase «««34567»»

SQL Server 2005 Paging – The Holy Grail Expand / Collapse
Author
Message
Posted Friday, May 7, 2010 11:52 AM


SSCoach

SSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoach

Group: General Forum Members
Last Login: Today @ 12:11 AM
Points: 17,823, Visits: 15,754
Nice article. Thanks for sharing.



Jason AKA CirqueDeSQLeil
I have given a name to my pain...
MCM SQL Server, MVP


SQL RNNR

Posting Performance Based Questions - Gail Shaw
Post #918239
Posted Friday, May 7, 2010 1:49 PM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Friday, July 25, 2014 9:43 AM
Points: 20, Visits: 20
I use a stored procedure to run the paged query, It runs the query twice; once to count, once to get the paged result set. I then return the count as the return value from the stored procedure and the result set is returned to the caller.

I also use Xml to pass in the search parameters and attributes for the paging information, but that would be a topic for another day.

Barry
Post #918339
Posted Friday, May 7, 2010 3:28 PM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Wednesday, June 1, 2011 8:25 AM
Points: 15, Visits: 52
This article introduces a novel approach (in my experience) to tackling an old problem, and I think the concept behind it is great. However, the article could have been much better written, with greatly improved examples.

First, I would have liked to have seen a table in a sample database used, rather than [INFORMATION_SCHEMA].columns. This confused me at first.

Second, in the "grail" query, the second "row_number" column is aliased as "totrows". This should have been named something like rseq, since it actually represented the reverse sequence number, not the total number of rows. This was confusing too.

Third, I could not find mention of the fact that the columns in the OVER() clause of the "grail" query MUST include the primary key, and the columns must be listed in the same order in both OVER() clauses or it throws the calculation off.

Fourth, a lot of confusion could have been eliminated, and it would have been a lot more useful to the reader, if the article had also included some query examples with filter criteria to demonstrate the effect on the row count (viz. that # of rows returned was not the number of rows in the entire table).

Lastly, and this is a minor point, I was not familiar with the WITH construct. Speaking only for myself, it would have been nice if this had been explained too.

I give 5 stars for the concept, and 2 stars for the writing.
Post #918383
Posted Friday, May 7, 2010 5:14 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Friday, July 25, 2014 4:55 PM
Points: 1, Visits: 56
One small observation from trying this: you have to be sure that the ORDER BY fields are unique, otherwise the Ascending and Descending sequences may not mirror each other, so the sum of the UP and DOWN row_number() fields is not guaranteed to be the total row count.

With that in mind, though, the central idea here does give another way to calculate a MEDIAN - sort for row_number on the value you are interested in (possibly including an ID in the ORDER BY to separate duplicate values) , then your median is the value where UP and DOWN are the same, or , if there is an even number of values, it's the average of those two where abs(UP - DOWN) = 1

Both these cases reduce to
avg(value).....WHERE (UP - DOWN) between -1 and 1

This snippet finds the median of the field enSum, in the table called Enrolments:

with RankedEnrol
As
(
Select enSum
, row_number() over (ORDER BY enSum, enID) UP -- identity is included in the ORDER BY to ensure uniqueness
, row_number() over (ORDER BY enSum DESC, enID DESC) DOWN
FROM Enrolments
)
SELECT avg(enSum) MedianEnrol
from RankedEnrol
WHERE (UP - DOWN) between -1 and 1

Post #918421
Posted Saturday, May 8, 2010 12:31 PM
SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Today @ 7:31 AM
Points: 1,579, Visits: 1,858
brian lewis (5/7/2010)
... the central idea here does give another way to calculate a MEDIAN - sort for row_number on the value you are interested in (possibly including an ID in the ORDER BY to separate duplicate values) , then your median is the value where UP and DOWN are the same, or , if there is an even number of values, it's the average of those two where abs(UP - DOWN) = 1

Both these cases reduce to
avg(value).....WHERE (UP - DOWN) between -1 and 1

This snippet finds the median of the field enSum, in the table called Enrolments:

with RankedEnrol
As
(
Select enSum
, row_number() over (ORDER BY enSum, enID) UP -- identity is included in the ORDER BY to ensure uniqueness
, row_number() over (ORDER BY enSum DESC, enID DESC) DOWN
FROM Enrolments
)
SELECT avg(enSum) MedianEnrol
from RankedEnrol
WHERE (UP - DOWN) between -1 and 1


Nice!
Post #918538
Posted Thursday, May 20, 2010 8:59 AM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Wednesday, July 16, 2014 2:47 PM
Points: 32, Visits: 105
Numerous people have noted that the "holy grail" pattern does not work when there are duplicate values in the sort column.

a simple work around for this is:

With CTE as (
select *, row_number() OVER (ORDER BY nonunique ASC) seq
from MyTable
),
with Reverse as
(
select *, row_number() OVER (ORDER BY seq DESC) tot
)
select fields, seq+tot-1 as totalRecs
from Reverse

In my tests, this did not add any noticeable overhead over the standard "Holy Grail" pattern.

I find this does work well for small sets, and contrary to another posters statement, I disagree that all your SQL should be optimized to 1 million+ records. You should use the correct SQL and indexes for your workload, and you should have a good understanding of your workload, expected growth, access pattern, read / write ratio ,etc...

If I have a system that has little data, but a high transaction count, wicked processors and a slow disk sub-system, this pattern is ideal. For large amounts of data, this pattern has been clearly shown to break down and would not necessarily be the correct choice.

I'd be interested in seeing differences in performance with real-world examples of complex queries, as paging is typically used on search results, and search queries rarely, if ever, read from a single table as most of the examples both for and against this pattern have dictated.
Post #925207
Posted Sunday, December 5, 2010 11:40 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Tuesday, October 7, 2014 2:06 PM
Points: 4, Visits: 15
nice solution. and very nicely written too, but the holy grill solution is not giving the correct 'Total count of records' in case if the 'ROW_NUMBER() OVER(ORDER BY' column is not the unique column.
Post #1030457
Posted Sunday, December 5, 2010 11:46 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Tuesday, October 7, 2014 2:06 PM
Points: 4, Visits: 15
Thanks SSC Rookie. your solution helped me.
Post #1030461
Posted Monday, December 6, 2010 7:27 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 8:08 AM
Points: 35,366, Visits: 31,906
Paul Muharsky-474732 (5/20/2010)
I disagree that all your SQL should be optimized to 1 million+ records.


Heh... Yeah... I know I'm responding to an old post but the kind of attitude above why there are so many posts on this site that start with "I have this performance problem. Can anyone help"? If you look back at Peso's code, 4 milli-seconds vs almost 6 seconds isn't something that anyone in their right mind would disagree with.

Write high performance code as if your life depended on it because saying "That's not my job" may be a self-fulfilling prophecy.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1030967
Posted Monday, December 6, 2010 8:15 PM


SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: 2 days ago @ 7:35 AM
Points: 2,397, Visits: 3,417
It seems the dual TOP/ORDER BY was the most efficient approach by the time.
I tested the new Denali OFFSET/FETCH and I got the exact same number of reads for @StartRow = 5000, but 2 ms instead of 4 ms.

SELECT		Number
FROM dbo.TallyNumbers
ORDER BY Number
OFFSET @StartRow - 1 ROWS
FETCH NEXT 50 ROWS ONLY




N 56°04'39.16"
E 12°55'05.25"
Post #1030975
« Prev Topic | Next Topic »

Add to briefcase «««34567»»

Permissions Expand / Collapse