Technical Article

Find Nth Max-Min Record - The Efficient Way!

,

It is quite often that we need to find Nth (5th or 8th or 10th) maximum or minimum number of record from the record set.
The solution is different in MS SQL than in Oracle because we have RowNum in Oracle.
Some time ago I got the same problem with a massive table of more than 20 Million records and 56 varchar columns and 1 int column.
I tried different solutions (got from the Internet!) but was disappointed with the performance.
Then I decided to spoil my hands and came up with this solution that is most efficient solution for this problem.


CREATE TABLE [Risk] (
[Col001] [varchar] (200) NULL ,
[Col002] [varchar] (200)  NULL ,
[Col003] [varchar] (10) NULL,
[RowNum_Risk] [int] IDENTITY (1, 1) NOT NULL ,
CONSTRAINT [PK_risk] PRIMARY KEY  CLUSTERED 
(
[RowNum_Risk]
)  ON [PRIMARY] 
) ON [PRIMARY]
</pre>
<pre>
Now lets insert some records into the table;
INSERT INTO Risk Values('AAA1','BBB1','CCC1',1);
INSERT INTO Risk Values('AAA2','BBB2','CCC2',2);
INSERT INTO Risk Values('AAA3','BBB3','CCC3',3);
INSERT INTO Risk Values('AAA4','BBB4','CCC4',4);
INSERT INTO Risk Values('AAA5','BBB5','CCC5',5);
INSERT INTO Risk Values('AAA6','BBB6','CCC6',6);
INSERT INTO Risk Values('AAA7','BBB7','CCC7',7);
INSERT INTO Risk Values('AAA8','BBB8','CCC8',8);
INSERT INTO Risk Values('AAA9','BBB9','CCC9',9);
INSERT INTO Risk Values('AAA10','BBB10','CCC10',10);


Now the query is;

SELECT Col001,Col002,Col003,RowNum_Risk FROM Risk 
WHERE RowNum_Risk = (SELECT MIN(RowNum_Risk) FROM (SELECT DISTINCT TOP 5 RowNum_Risk FROM Risk ORDER BY RowNum_Risk DESC)B)

This query will return 5th Max record from the table. If you wish to find 8th or 10th or any number, just replace 5 with the number you want.
Now Lets look at the execution plan of the query.

MS SQL is using Index Scan for <BR>SELECT DISTINCT TOP 5 RowNum_Risk FROM Risk ORDER BY RowNum_Risk DESC <BR>and the getting top N record.
It is then using Index Seek for the outer query and producing the result.

Conslusion:
This is the best solution I found for selecting Nth Max or Min number from a massive table in MS SQL Server. How ever when you are trying to find say 1000000th Max record from the data set then the query execution time will be more. But it is in minutes and not in hours!

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating