Advanced ranking function

  • This is my schema:

    if object_id('tempdb.dbo.#Countries') is not null drop table #Countries

    go

    create table #Countries (

    Id int identity(1,1),

    Country varchar(30)

    )

    go

    insert into #Countries (Country)

    select 'C1' union all

    select 'C1' union all

    select 'C1' union all

    select 'C1' union all

    select 'C1' union all

    select 'C2' union all

    select 'C2' union all

    select 'C2' union all

    select 'C2' union all

    select 'C3' union all

    select 'C3'

    And this is the desired output:

    C11

    C11

    C11

    C12

    C12

    C23

    C23

    C23

    C24

    C35

    C35

    The logic is this:

    - increase the value after three records,

    - increase the value if country changes.

    Now I am calculating the value with this query:

    select Country, ((row_number() over (partition by Country order by Country) - 1) / 3) + 1 X

    from #Countries

    order by Country

    But the output is wrong:

    C11

    C11

    C11

    C12

    C12

    C21

    C21

    C21

    C22

    C31

    C31

    In order to get the desired values I run a cursor on my output which goes something like this:

    1. run cursor on all distinct countries ordered by country, but without first country (so I get C2, C3)

    2. in each iteration of the cursor get max value from the previous country (for C2 the max value of previous country is 2)

    3. update the table for the current country so that the value is value + max(value) of previous country.

    The output is then correct, but I would really like to do this witout cursors if possible?

  • Quick and simple solution, should be enough to get you passed this hurdle, let me know if you need any explanations.

    😎

    USE tempdb;

    GO

    SET NOCOUNT ON;

    if object_id('tempdb.dbo.#Countries') is not null drop table #Countries

    go

    create table #Countries (

    Id int identity(1,1),

    Country varchar(30)

    )

    go

    insert into #Countries (Country)

    select 'C1' union all

    select 'C1' union all

    select 'C1' union all

    select 'C1' union all

    select 'C1' union all

    select 'C2' union all

    select 'C2' union all

    select 'C2' union all

    select 'C2' union all

    select 'C3' union all

    select 'C3';

    ;WITH BASE_DATA AS

    (

    SELECT

    C.Id

    ,C.Country

    ,DENSE_RANK() OVER

    (

    ORDER BY C.Country

    )

    + (CEILING(ROW_NUMBER() OVER

    (

    PARTITION BY C.Country

    ORDER BY C.Country

    ) / 3.0)) - 1 AS C_RID

    ,CASE

    WHEN LAG(C.Country,1,C.Country) OVER

    (

    ORDER BY C.Country

    ) <> C.Country THEN 1

    ELSE 0

    END AS C_CNTR

    FROM #Countries C

    )

    SELECT

    BD.Id

    ,BD.Country

    ,BD.C_RID

    + SUM(BD.C_CNTR) OVER

    (

    PARTITION BY (SELECT NULL)

    ORDER BY BD.Country

    ROWS BETWEEN UNBOUNDED PRECEDING

    AND CURRENT ROW

    ) AS COUNTRY_RANK

    FROM BASE_DATA BD;

    Results

    Id Country COUNTRY_RANK

    ---- --------- -------------

    1 C1 1

    2 C1 1

    3 C1 1

    4 C1 2

    5 C1 2

    6 C2 3

    7 C2 3

    8 C2 3

    9 C2 4

    10 C3 5

    11 C3 5

  • Ughhhh... An explanation, yes please 🙂

  • This might be simpler to understand.

    WITH CTE AS(

    SELECT *, (ROW_NUMBER() OVER(PARTITION BY Country ORDER BY Id)+ 2) / 3 AS rowrank

    FROM #Countries

    )

    SELECT Id, Country, DENSE_RANK() OVER( ORDER BY Country, rowrank)

    FROM CTE

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Luis Cazares (5/25/2015)


    This might be simpler to understand.

    WITH CTE AS(

    SELECT *, (ROW_NUMBER() OVER(PARTITION BY Country ORDER BY Id)+ 2) / 3 AS rowrank

    FROM #Countries

    )

    SELECT Id, Country, DENSE_RANK() OVER( ORDER BY Country, rowrank)

    FROM CTE

    I love it!!! Great application of integer math! +1000:-)

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Luis Cazares (5/25/2015)


    This might be simpler to understand.

    WITH CTE AS(

    SELECT *, (ROW_NUMBER() OVER(PARTITION BY Country ORDER BY Id)+ 2) / 3 AS rowrank

    FROM #Countries

    )

    SELECT Id, Country, DENSE_RANK() OVER( ORDER BY Country, rowrank)

    FROM CTE

    Ah, very clever!

    ____________________________________________________

    Deja View - The strange feeling that somewhere, sometime you've optimised this query before

    How to get the best help on a forum

    http://www.sqlservercentral.com/articles/Best+Practices/61537
  • Thank you guys.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • _simon_ (5/25/2015)


    Ughhhh... An explanation, yes please 🙂

    The DENSE_RANK provides an one based incremental value for each country in an alphabetical order, the ROW_NUMBER / 3 adds the value of one for every group of three entries. This satisfies the requirements if there is only one country, to connect the list values a running total is incremented (using the LAG function) for every change of countries in the list (the last SUM).

    😎

  • Luis Cazares (5/25/2015)


    This might be simpler to understand.

    WITH CTE AS(

    SELECT *, (ROW_NUMBER() OVER(PARTITION BY Country ORDER BY Id)+ 2) / 3 AS rowrank

    FROM #Countries

    )

    SELECT Id, Country, DENSE_RANK() OVER( ORDER BY Country, rowrank)

    FROM CTE

    Quite neat and compact solution Luis, very much like my initial thought. There is a caveat though in the additional sort needed as the sort order is different inside and outside the cte, has quite an impact on the performance. The other (lengthy) solution has only one sort operator which can be eliminated with a single index, no such options for this one.

    😎

  • @luis Cazares:

    Thank you guys.

    No no, thank you 🙂

    @Eirikur Eiriksson: I learned a lot from your solution (never saw 'LAG' and 'ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW' function/syntax before), thanks for that. But your query is returning invalid results for the table:

    insert into #Countries (Country)

    select 'C1' union all

    select 'C2' union all

    select 'C2' union all

    select 'C2' union all

    select 'C2' union all

    select 'C2' union all

    select 'C2' union all

    select 'C2' union all

    select 'C2' union all

    select 'C3' union all

    select 'C3'

    I get:

    1C11

    2C23

    3C23

    4C23

    5C24

    6C24

    7C24

    8C25

    9C25

    10C35

    11C35

    Instead of:

    1C11

    2C22

    3C22

    4C22

    5C23

    6C23

    7C23

    8C24

    9C24

    10C35

    11C35

  • Seems like you are using Rank() instead of Dense_Rank()

  • Eirikur Eiriksson (5/25/2015)


    Luis Cazares (5/25/2015)


    This might be simpler to understand.

    ...has only one sort operator which can be eliminated with a single index...

    [font="Comic Sans MS"]Sinking feeling of reverting back to newbie status - I cannot make heads or tail of that last statement. Can you help by expanding a bit on this topic ?

    Regards[/font]

  • j-1064772 (5/29/2015)


    Eirikur Eiriksson (5/25/2015)


    Luis Cazares (5/25/2015)


    This might be simpler to understand.

    ...has only one sort operator which can be eliminated with a single index...

    [font="Comic Sans MS"]Sinking feeling of reverting back to newbie status - I cannot make heads or tail of that last statement. Can you help by expanding a bit on this topic ?

    Regards[/font]

    Eirikur is simply pointing out the fact that Luis's solution is ordering by an expression (rowrank) in the final select statement (see the order by portion of the DENSE_RANK function).

    As a result, the optimizer has added an expensive looking sort operator to the execution plan... Since you can't index an expression, there is no way to index your way out of that sort operator in this situation. Eirikur's solution, on the other hand doesn't order by any expressions and can, therefore, use an index to eliminate the sort operator.

    That said... A covering index with Country & ID as the key columns (in that order) would exactly mimic the order needed to eliminate the sort operator in Luis's solution.

    The sort operator would still be there and would continue to look expensive but I'd expect it's impact on the actual execution to be minimal to nonexistent.

    Edit... The stricken out assumption not supported by test... See the following post...

  • Testing the assumption that a covering index would negate the impact of the sort operator cause by ordering on an expression (rowrank).

    The test table... Just under 1/2 M rows similar to the OP's test data...

    IF OBJECT_ID('tempdb..#Countries') IS NOT NULL

    DROP TABLE #Countries;

    ;WITH Data (c) AS (

    SELECT d.c FROM ( VALUES

    ('C1'),('C1'),('C1'),('C1'),('C1'),('C1'),('C1'),('C1'),('C1'),('C1'),('C1'),('C1'),('C1'),('C1'),('C1'),('C1'),('C1'),

    ('C2'),('C2'),('C2'),('C2'),('C2'),('C2'),('C2'),('C2'),('C2'),('C2'),('C2'),('C2'),('C2'),('C2'),('C2'),('C2'),('C2'),

    ('C3'),('C3'),('C3'),('C3'),('C3'),('C3'),('C3'),('C3'),('C3'),('C3'),('C3'),('C3'),('C3'),('C3'),('C3'),('C3'),('C3'),

    ('C4'),('C4'),('C4'),('C4'),('C4'),('C4'),('C4'),('C4'),('C4'),('C4'),('C4'),('C4'),('C4'),('C4'),('C4'),('C4'),('C4'),

    ('C5'),('C5'),('C5'),('C5'),('C5'),('C5'),('C5'),('C5'),('C5'),('C5'),('C5'),('C5'),('C5'),('C5'),('C5'),('C5'),('C5'),

    ('C6'),('C6'),('C6'),('C6'),('C6'),('C6'),('C6'),('C6'),('C6'),('C6'),('C6'),('C6'),('C6'),('C6'),('C6'),('C6'),('C6'),

    ('C7'),('C7'),('C7'),('C7'),('C7'),('C7'),('C7'),('C7'),('C7'),('C7'),('C7'),('C7'),('C7'),('C7'),('C7'),('C7'),('C7'),

    ('C8'),('C8'),('C8'),('C8'),('C8'),('C8'),('C8'),('C8'),('C8'),('C8'),('C8'),('C8'),('C8'),('C8'),('C8'),('C8'),('C8'),

    ('C9'),('C9'),('C9'),('C9'),('C9'),('C9'),('C9'),('C9'),('C9'),('C9'),('C9'),('C9'),('C9'),('C9'),('C9'),('C9'),('C9'),

    ('C10'),('C10'),('C10'),('C10'),('C10'),('C10'),('C10'),('C10'),('C10'),('C10'),('C10'),('C10'),('C10'),('C10'),('C10'),('C10'),('C10'),

    ('C11'),('C11'),('C11'),('C11'),('C11'),('C11'),('C11'),('C11'),('C11'),('C11'),('C11'),('C11'),('C11'),('C11'),('C11'),('C11'),('C11'),

    ('C12'),('C12'),('C12'),('C12'),('C12'),('C12'),('C12'),('C12'),('C12'),('C12'),('C12'),('C12'),('C12'),('C12'),('C12'),('C12'),('C12'),

    ('C13'),('C13'),('C13'),('C13'),('C13'),('C13'),('C13'),('C13'),('C13'),('C13'),('C13'),('C13'),('C13'),('C13'),('C13'),('C13'),('C13'),

    ('C14'),('C14'),('C14'),('C14'),('C14'),('C14'),('C14'),('C14'),('C14'),('C14'),('C14'),('C14'),('C14'),('C14'),('C14'),('C14'),('C14'),

    ('C15'),('C15'),('C15'),('C15'),('C15'),('C15'),('C15'),('C15'),('C15'),('C15'),('C15'),('C15'),('C15'),('C15'),('C15'),('C15'),('C15'),

    ('C16'),('C16'),('C16'),('C16'),('C16'),('C16'),('C16'),('C16'),('C16'),('C16'),('C16'),('C16'),('C16'),('C16'),('C16'),('C16'),('C16'),

    ('C17'),('C17'),('C17'),('C17'),('C17'),('C17'),('C17'),('C17'),('C17'),('C17'),('C17'),('C17'),('C17'),('C17'),('C17'),('C17'),('C17'),

    ('C18'),('C18'),('C18'),('C18'),('C18'),('C18'),('C18'),('C18'),('C18'),('C18'),('C18'),('C18'),('C18'),('C18'),('C18'),('C18'),('C18'),

    ('C19'),('C19'),('C19'),('C19'),('C19'),('C19'),('C19'),('C19'),('C19'),('C19'),('C19'),('C19'),('C19'),('C19'),('C19'),('C19'),('C19'),

    ('C20'),('C20'),('C20'),('C20'),('C20'),('C20'),('C20'),('C20'),('C20'),('C20'),('C20'),('C20'),('C20'),('C20'),('C20'),('C20'),('C20'),

    ('C21'),('C21'),('C21'),('C21'),('C21'),('C21'),('C21'),('C21'),('C21'),('C21'),('C21'),('C21'),('C21'),('C21'),('C21'),('C21'),('C21'),

    ('C22'),('C22'),('C22'),('C22'),('C22'),('C22'),('C22'),('C22'),('C22'),('C22'),('C22'),('C22'),('C22'),('C22'),('C22'),('C22'),('C22'),

    ('C23'),('C23'),('C23'),('C23'),('C23'),('C23'),('C23'),('C23'),('C23'),('C23'),('C23'),('C23'),('C23'),('C23'),('C23'),('C23'),('C23'),

    ('C24'),('C24'),('C24'),('C24'),('C24'),('C24'),('C24'),('C24'),('C24'),('C24'),('C24'),('C24'),('C24'),('C24'),('C24'),('C24'),('C24'),

    ('C25'),('C25'),('C25'),('C25'),('C25'),('C25'),('C25'),('C25'),('C25'),('C25'),('C25'),('C25'),('C25'),('C25'),('C25'),('C25'),('C25'),

    ('C26'),('C26'),('C26'),('C26'),('C26'),('C26'),('C26'),('C26'),('C26'),('C26'),('C26'),('C26'),('C26'),('C26'),('C26'),('C26'),('C26'),

    ('C27'),('C27'),('C27'),('C27'),('C27'),('C27'),('C27'),('C27'),('C27'),('C27'),('C27'),('C27'),('C27'),('C27'),('C27'),('C27'),('C27'),

    ('C28'),('C28'),('C28'),('C28'),('C28'),('C28'),('C28'),('C28'),('C28'),('C28'),('C28'),('C28'),('C28'),('C28'),('C28'),('C28'),('C28'),

    ('C29'),('C29'),('C29'),('C29'),('C29'),('C29'),('C29'),('C29'),('C29'),('C29'),('C29'),('C29'),('C29'),('C29'),('C29'),('C29'),('C29'),

    ('C30'),('C30'),('C30'),('C30'),('C30'),('C30'),('C30'),('C30'),('C30'),('C30'),('C30'),('C30'),('C30'),('C30'),('C30'),('C30'),('C30'),

    ('C31'),('C31'),('C31'),('C31'),('C31'),('C31'),('C31'),('C31'),('C31'),('C31'),('C31'),('C31'),('C31'),('C31'),('C31'),('C31'),('C31'),

    ('C32'),('C32'),('C32'),('C32'),('C32'),('C32'),('C32'),('C32'),('C32'),('C32'),('C32'),('C32'),('C32'),('C32'),('C32'),('C32'),('C32'),

    ('C33'),('C33'),('C33'),('C33'),('C33'),('C33'),('C33'),('C33'),('C33'),('C33'),('C33'),('C33'),('C33'),('C33'),('C33'),('C33'),('C33'),

    ('C34'),('C34'),('C34'),('C34'),('C34'),('C34'),('C34'),('C34'),('C34'),('C34'),('C34'),('C34'),('C34'),('C34'),('C34'),('C34'),('C34'),

    ('C35'),('C35'),('C35'),('C35'),('C35'),('C35'),('C35'),('C35'),('C35'),('C35'),('C35'),('C35'),('C35'),('C35'),('C35'),('C35'),('C35'),

    ('C36'),('C36'),('C36'),('C36'),('C36'),('C36'),('C36'),('C36'),('C36'),('C36'),('C36'),('C36'),('C36'),('C36'),('C36'),('C36'),('C36'),

    ('C37'),('C37'),('C37'),('C37'),('C37'),('C37'),('C37'),('C37'),('C37'),('C37'),('C37'),('C37'),('C37'),('C37'),('C37'),('C37'),('C37'),

    ('C38'),('C38'),('C38'),('C38'),('C38'),('C38'),('C38'),('C38'),('C38'),('C38'),('C38'),('C38'),('C38'),('C38'),('C38'),('C38'),('C38'),

    ('C39'),('C39'),('C39'),('C39'),('C39'),('C39'),('C39'),('C39'),('C39'),('C39'),('C39'),('C39'),('C39'),('C39'),('C39'),('C39'),('C39'),

    ('C40'),('C40'),('C40'),('C40'),('C40'),('C40'),('C40'),('C40'),('C40'),('C40'),('C40'),('C40'),('C40'),('C40'),('C40'),('C40'),('C40')

    ) d (c)

    )

    SELECT

    ISNULL(ROW_NUMBER() OVER (ORDER BY d1.c), 0) AS ID,

    d1.c AS Country

    INTO #Countries

    FROM Data d1, Data d2;

    Add the covering index that is the subject of the test. In this case, no reason to not just make it a clustered primary key...

    ALTER TABLE #Countries ADD CONSTRAINT pk_tmpCountries PRIMARY KEY CLUSTERED (Country, ID);

    The test script (modeled after Luis's solution)

    ;WITH cte AS (

    SELECT

    c.ID,

    c.Country,

    (ROW_NUMBER() OVER (PARTITION BY c.Country ORDER BY c.Country) + 2) / 3 AS RowRank

    FROM

    #Countries c

    )

    SELECT

    cte.ID,

    cte.Country,

    DENSE_RANK() OVER (ORDER BY cte.Country, cte.RowRank) AS CountryRank

    FROM

    cte

    All tests were executed with "Discard results after execution" turned on to eliminate display rendering from the execution times...

    The 1st 5 executions were exactly as shown above...

    The 2nd 5 were were executed with ", cte.RowRank" removed... (all else identical)

  • Great... More missing posts... :crazy:

    If anyone w/ admin privileges can find my last post on this thread (index test w/ results), please de-spam it... Because, I'm not rewriting it...

Viewing 15 posts - 1 through 15 (of 20 total)

You must be logged in to reply to this topic. Login to reply