How to add a "Rank" column to a view on the fly using a self join.

  • In an interview yesterday I was given this question:

    Given a table "Test" that has three columns

    id (int, identity key)

    name (varchar(25) )

    score (int)

    Create a view that displays id, name, score and rank, where the rank is a sequential numbering of the scores in descending order.

    Here is some sample data:

    id name score

    == ===== =====

    1 John 25

    2 Mary 65

    3 Dave 35

    Here is the output based on the above data:

    id name score rank

    == ==== ==== ===

    2 Mary 65 1

    3 Dave 35 2

    1 John 25 3

    The interviewer gave me a hint that the answer had to do with a self join where the ranking is determined by comparing the score of a record to the rest of the scores by doing a self-join.

    Admittedly I am stumped, so I thought I would ask the experts for help.

    - dtguin

  • OK, found the answer.... well at least how to do the query. I just need to wrap it in a view.

    SELECT a1.name, a1.score, COUNT(a2.score) Rank

    FROM Test a1, Test a2

    WHERE a1.score < a2.score

    GROUP BY a1.name, a1.score

    ORDER BY a1.score DESC;

    Thanks.

    - dtguin

  • Self-join not necessary, just use one of the new (to sql 2005) ranking functions:

    -- See how this starts off with a table and data in it?

    -- If you had provided us the data in this format,

    -- it would have made things easier for all of the

    -- volunteers on this site to help you out.

    DECLARE @Test TABLE (

    id int identity PRIMARY KEY CLUSTERED,

    name varchar(25) ,

    score int

    )

    insert into @Test

    select 'John',25 UNION ALL

    select 'Mary',65 UNION ALL

    select 'Dave',35

    select id,

    name,

    score,

    RankNbr = RANK() OVER (Order by Score DESC)

    from @Test

    order by RankNbr

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • Your question didn't mention what to do if there are two entries with the same value. There are two options:

    1. All values that tie have the same rank, and the next value has the next sequential rank #.

    2. All values that tie have the same rank, and the next value has the rank # that would have been assigned had there not been any ties.

    #1 is known as a dense rank, and you would use the DENSE_RANK() function.

    #2 is known as a rank, and you would use the RANK() function.

    To show the difference, let's add a duplicate score to the above example, then run both:

    -- See how this starts off with a table and data in it?

    -- If you had provided us the data in this format,

    -- it would have made things easier for all of the

    -- volunteers on this site to help you out.

    DECLARE @Test TABLE (

    id int identity PRIMARY KEY CLUSTERED,

    name varchar(25) ,

    score int

    )

    insert into @Test

    select 'John',25 UNION ALL

    select 'Mary',65 UNION ALL

    select 'Dave',35 UNION ALL

    select 'Joseph', 35 --<< add this new data, with a duplicate score

    select id,

    name,

    score,

    RankNbr = RANK() OVER (Order by Score DESC),

    DenseRankNbr = DENSE_RANK() OVER (Order by Score DESC)

    from @Test

    order by RankNbr

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • dtguin (4/6/2010)


    OK, found the answer.... well at least how to do the query. I just need to wrap it in a view.

    SELECT a1.name, a1.score, COUNT(a2.score) Rank

    FROM Test a1, Test a2

    WHERE a1.score < a2.score

    GROUP BY a1.name, a1.score

    ORDER BY a1.score DESC;

    Thanks.

    - dtguin

    I agree... a self join isn't necessary. In fact, what you've created is a rather deadly "Triangular Join" that will eat the face off your server and your I/O system in the face of even mild scalability requirements.

    Please see the following article on Triangular Joins and why they're so bad.

    http://www.sqlservercentral.com/articles/T-SQL/61539/

    Wayne suggested a couple of good alternatives.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 5 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply