Sequence? Increment?

  • Hi all, I am not sure how to accomplish this. Its something like an increment or rank but I haven't had any success with any of the ranking functions ...

    Here is my data set sorted by year/mon/day:

    Year Mon Day Group

    ----- --- --- -----

    2009 04 09 C

    2009 04 10 C

    ...

    2009 07 14 C

    2009 08 15 B

    2009 08 16 B

    ...

    All I want to do is to add a column to produce an increment along the group dimension. I.e

    Year Mon Day Group Column

    ----- --- --- ------ -------

    2009 04 09 C 1

    2009 04 10 C 1

    ...

    2009 07 14 C 1

    2009 08 15 B 2

    2009 08 16 B 2

    ...

    It seems dense rank is close but since I have to put an order by clause the result flips the values I want. In addition, there could be more than 2 groups. Any thoughts?

    Thanks!

  • You have a nice table somewhere to experiment with. We don't. Please see the article at the first link in my signature below to help us help you much more quickly. Thanks.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Based on the very limited data presented it seems that you could use:

    Row_Number() Over(Partition By Group Order By Group Desc, Year, Month, Day)

    I did the Order BY Group DESC because you have the C group with 1 and the B group with 2, but didn't give any criteria as to why it is that way.

  • I thought using a column in the order by that's included in the partition is irrelevent as the partitioned column by definition will be the same value for all rows within that partition?

    Again, not entirely sure what the OP's after, but this looks like it might fit the expected output:

    DENSE_RANK() OVER (ORDER by [group] desc)

  • First off, apologies for not providing a "real" data set - I can understand how that would get annoying 🙂 So here is a second shot at my "real" data:

    CREATE TABLE TestTable

    (

    Year varchar(4),

    Month varchar(2),

    Day varchar(2),

    Grp varchar(4)

    )

    INSERT INTO TestTable (Year,Month,Day,Grp)

    SELECT '2009','04','09','C' UNION ALL

    SELECT '2009','04','10','C' UNION ALL

    SELECT '2009','04','11','C' UNION ALL

    SELECT '2009','04','13','C' UNION ALL

    SELECT '2009','04','20','C' UNION ALL

    SELECT '2009','04','29','B' UNION ALL

    SELECT '2009','05','09','B' UNION ALL

    SELECT '2009','05','15','B' UNION ALL

    SELECT '2009','05','20','B' UNION ALL

    SELECT '2009','05','21','B' UNION ALL

    SELECT '2009','05','23','D' UNION ALL

    SELECT '2009','05','29','D' UNION ALL

    SELECT '2009','06','09','D' UNION ALL

    SELECT '2009','06','15','D'

    So my data is essentially a sparse time series sorted by Year/Month/Day with group designations for a particular employee. Employees can shift groups often and essentially want to label each team in the sequence. So in this example C would be team 1, B is team 2 and D is team 3.

    Unfortunately the dense rank solution wouldn't work in general since an employee can be in multiple groups with no inherit "order". The goal table would be the following:

    SELECT YEAR, MONTH, DAY, Grp,

    CASE WHEN Grp = 'C' THEN 1

    WHEN Grp = 'B' THEN 2

    WHEN Grp = 'D' THEN 3

    ELSE 0

    END AS Sequence

    FROM TestTable

    Of course not using case statements ... Hopefully that makes things a little more clear.

    Thanks!

  • nice post!

  • HowardW (2/22/2010)


    I thought using a column in the order by that's included in the partition is irrelevent as the partitioned column by definition will be the same value for all rows within that partition?

    Again, not entirely sure what the OP's after, but this looks like it might fit the expected output:

    DENSE_RANK() OVER (ORDER by [group] desc)

    Yes, this looks like the answer although I'm not sure how you would guarantee that Group C should be 1 in this set. Why should C be 1?

  • Well C is the first group in the list (relative to date) ... hence number 1. Similarly for B and D, they are the second and third "distinct" groups over time, so I wanted B to be identified as 2 and D as 3.

    I think if I used an OVER (ORDER BY Grp DESC) clause that would say D is first, C is second and B is third.

  • emblabac (2/22/2010)


    Well C is the first group in the list (relative to date) ... hence number 1. Similarly for B and D, they are the second and third "distinct" groups over time, so I wanted B to be identified as 2 and D as 3.

    I think if I used an OVER (ORDER BY Grp DESC) clause that would say D is first, C is second and B is third.

    If you are not explicitly using date as an ordering key in your query, then SQL does not necessarily return them in that order.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • emblabac (2/22/2010)


    Well C is the first group in the list (relative to date) ... hence number 1. Similarly for B and D, they are the second and third "distinct" groups over time, so I wanted B to be identified as 2 and D as 3.

    I think if I used an OVER (ORDER BY Grp DESC) clause that would say D is first, C is second and B is third.

    As Barry said, if your ordering is based on the date then you have to include date. It sounds like what you need to do is find out the order of the groups based on start date and then assign that number to each member of the group. Something like this (I don't guarantee that this is the best way):

    WITH cteRanks

    AS (SELECT

    RANK() OVER (ORDER BY MIN(CONVERT(SMALLDATETIME, (CONVERT(CHAR(4), YEAR) + CONVERT(varchar(2), MONTH) + CONVERT(varchar(2), DAY))))) AS ranking ,

    grp

    FROM

    TestTable

    GROUP BY

    grp)

    SELECT

    t.*,

    C1.ranking

    FROM

    cteRanks AS C1 JOIN

    testtable AS T ON C1.grp = t.grp

    ORDER BY

    C1.ranking

  • This seemed to work! Thanks Jack!

Viewing 11 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply