April 6, 2010 at 1:09 pm
In an interview yesterday I was given this question:
Given a table "Test" that has three columns
id (int, identity key)
name (varchar(25) )
score (int)
Create a view that displays id, name, score and rank, where the rank is a sequential numbering of the scores in descending order.
Here is some sample data:
id name score
== ===== =====
1 John 25
2 Mary 65
3 Dave 35
Here is the output based on the above data:
id name score rank
== ==== ==== ===
2 Mary 65 1
3 Dave 35 2
1 John 25 3
The interviewer gave me a hint that the answer had to do with a self join where the ranking is determined by comparing the score of a record to the rest of the scores by doing a self-join.
Admittedly I am stumped, so I thought I would ask the experts for help.
- dtguin
April 6, 2010 at 2:08 pm
OK, found the answer.... well at least how to do the query. I just need to wrap it in a view.
SELECT a1.name, a1.score, COUNT(a2.score) Rank
FROM Test a1, Test a2
WHERE a1.score < a2.score
GROUP BY a1.name, a1.score
ORDER BY a1.score DESC;
Thanks.
- dtguin
April 6, 2010 at 2:51 pm
Self-join not necessary, just use one of the new (to sql 2005) ranking functions:
-- See how this starts off with a table and data in it?
-- If you had provided us the data in this format,
-- it would have made things easier for all of the
-- volunteers on this site to help you out.
DECLARE @Test TABLE (
id int identity PRIMARY KEY CLUSTERED,
name varchar(25) ,
score int
)
insert into @Test
select 'John',25 UNION ALL
select 'Mary',65 UNION ALL
select 'Dave',35
select id,
name,
score,
RankNbr = RANK() OVER (Order by Score DESC)
from @Test
order by RankNbr
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
April 6, 2010 at 2:56 pm
Your question didn't mention what to do if there are two entries with the same value. There are two options:
1. All values that tie have the same rank, and the next value has the next sequential rank #.
2. All values that tie have the same rank, and the next value has the rank # that would have been assigned had there not been any ties.
#1 is known as a dense rank, and you would use the DENSE_RANK() function.
#2 is known as a rank, and you would use the RANK() function.
To show the difference, let's add a duplicate score to the above example, then run both:
-- See how this starts off with a table and data in it?
-- If you had provided us the data in this format,
-- it would have made things easier for all of the
-- volunteers on this site to help you out.
DECLARE @Test TABLE (
id int identity PRIMARY KEY CLUSTERED,
name varchar(25) ,
score int
)
insert into @Test
select 'John',25 UNION ALL
select 'Mary',65 UNION ALL
select 'Dave',35 UNION ALL
select 'Joseph', 35 --<< add this new data, with a duplicate score
select id,
name,
score,
RankNbr = RANK() OVER (Order by Score DESC),
DenseRankNbr = DENSE_RANK() OVER (Order by Score DESC)
from @Test
order by RankNbr
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
April 6, 2010 at 3:19 pm
dtguin (4/6/2010)
OK, found the answer.... well at least how to do the query. I just need to wrap it in a view.SELECT a1.name, a1.score, COUNT(a2.score) Rank
FROM Test a1, Test a2
WHERE a1.score < a2.score
GROUP BY a1.name, a1.score
ORDER BY a1.score DESC;
Thanks.
- dtguin
I agree... a self join isn't necessary. In fact, what you've created is a rather deadly "Triangular Join" that will eat the face off your server and your I/O system in the face of even mild scalability requirements.
Please see the following article on Triangular Joins and why they're so bad.
http://www.sqlservercentral.com/articles/T-SQL/61539/
Wayne suggested a couple of good alternatives.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 5 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply