group by part of a multi column primary key

  • create table t1 (col1 int not null, col2 int not null, col3 varchar(10))

    alter table t1 add constraint PK_1 Primary Key ([col1],[col2])

    insert into t1 values (1,1,'oneone')

    insert into t1 values (1,2,'onetwo')

    insert into t1 values (1,3,'onethree')

    insert into t1 values (1,4,'onefour')

    insert into t1 values (2,1,'twoone')

    insert into t1 values (2,2,'twotwo')

    insert into t1 values (2,3,'twothree')

    insert into t1 values (2,4,'twofour')

    insert into t1 values (2,5,'twofive')

    select b.*

    from

    (select col1, max(col2) as col2 from t1 group by col1) as fakePrimaryKey

    join t1 as b on fakePrimaryKey.col1 = b.col1 and fakePrimaryKey.col2= b.col2

    /* This is the data set that I want returned:

    The entire row of data where the value of column 2 is the maximum for each distinct value in col1.

    1, 4, 'onefour'

    2, 5, 'twofive'

    I can do this by joining the table to itself using the following logic:

    In essence, creating a derived table that holds primary key values and then joining that back to the table on the primary key columns in order to retrieve the rest of the data in the row.

    BUT, I was hoping that there was a more efficient way to accomplish the same thing... Is there some way to do this that does not require joining the table to itself? a having clause? a group by clause?

    Thanks,

    J

    */

  • I may be over-simplifying this, but:

    
    
    SELECTcol1,
    Max(col2)
    FROMt1
    GROUP BYcol1

    --SJTerrill

    Ah, I'm definitely over-simplifying. I believe you'll have to resort to subquery to return the contents of the whole row.

    Edited by - TheWildHun on 06/12/2003 12:49:27 PM

  • Here's a quick solution:

    
    
    SELECTt1.col1,
    Max(t1.col2) col2
    INTO#tmp
    FROMt1
    GROUP BYt1.col1
    
    
    SELECT#tmp.col1,
    #tmp.col2,
    t1.col3
    FROM#tmp
    JOIN
    t1
    ON #tmp.col1 = t1.col1
    AND #tmp.col2 = t1.col2

    DROP TABLE#tmp

    --SJTerrill

    Edited by - TheWildHun on 06/12/2003 12:56:09 PM

  • I think that it may be more of a data storage problem than a query problem...

    Here's another idea...I should have two tables with a 1 to 1 relationship (on id):

    create table t1(col1 int not null, col2 int not null, id int)

    create table t1Extended ( id int Primary Key, data1 int, data2 int, data3 int)

    select c.*

    from

    (select col1, max(col2) as col2 from t1 group by col1) as fakePrimaryKey

    join t1 as b on fakePrimaryKey.col1 = b.col1 and fakePrimaryKey.col2= b.col2

    join t1Extended c on c.id = b.id

    Do you think that is a faster way of retrieving the data than having all the data in one table?

    -J

    Edited by - jraha on 06/12/2003 1:19:45 PM

    Edited by - jraha on 06/12/2003 1:23:31 PM

  • Actually, I like your original example. For your requirement of returning all columns of the candidate rows I can't think of a better solution.


    Cheers,
    - Mark

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply