Select "first" line from each group

  • Result of Select statement with GROUP BY clause:

    a1b2b3

    a1b4b5

    b1c1c2

    b1c3c4

    b1c5c6

    ...

    The influence of the GROUP BY clause is apparent in column 1.

    I'm calling the elements "a1, b1, c1" so we don't assume they are integers, strings, or dates, or in any particlar order. They are simply grouped as shown above. (In particular cases the columns will be ordered in some manner but that ordering will vary from case to case so I don't want to assume any particular order)

    I'm trying to use t-sql to get the "first" record from each group. Here's what I want:

    a1b2b3

    b1c1c2

    Any ideas how to approach this?

    TIA,

    Bill

  • Here is one way using temporary variable and derived tables. This was quickly of the top of my head, hence there will be better ways.

    set nocount on

    -- setting up base data

    declare @a table (a char(2), b char(2), c char(2))

    insert into @a values ('a1', 'b2', 'b3')

    insert into @a values ('a1', 'b4', 'b5')

    insert into @a values ('b1', 'c1', 'c2')

    insert into @a values ('b1', 'c3', 'c4')

    insert into @a values ('b1', 'c5', 'c6')

    -- puting data into a table with an unique id.

    declare @b-2 table (pk_id int identity, a char(2), b char(2), c char(2))

    insert into @b-2 select * from @a

    select a,b,c from @b-2 d

    inner join (

    select min(pk_id) as pk_id from @b-2 b

    inner join (select distinct a from @b-2) as a

    on a.a = b.a group by a.a

    ) as c on c.pk_id = d.pk_id

    Steven

  • You have to have the group by clause in the insert .. select for it to be correct.

  • Assuming that the specified 5 rows are in a table tmp_a, try this:

    select

    t1.a, t1.min_b, (select min(c) from tmp_a where (a = t1.a) and (b = t1.min_b)) min_c

    from

    (select a, min(b) as min_b from tmp_a group by a) as t1

    order by

    t1.a

    Additional index on (a, b, c) will help, if the table is large.

    HTH,

    Regards,

    g_smilevski.

  • Use the results as a derived table and then just GROUP BY again.

    SELECT Col1, MIN(Col2), MIN(Col3)

    FROM Results

    GROUP By Col1

    --Jonathan



    --Jonathan

  • Jonathan's solution won't work if the minimum values of both columns are not in the same row. It will return unrelated data.

    A solution that will always work, regardless of the number of columns and stuff...

    
    
    SELECT t1.* FROM your_table t1
    WHERE CHECKSUM(*) =
    (SELECT top 1 CHECKSUM(*)
    FROM your_table t2
    WHERE t2.groupby_col = t1.groupby_col)

    Edited by - NPeeters on 10/06/2003 06:31:05 AM

  • Obviously, if you have a unique identifier somewhere in your table, you can do the 'joining' in the subquery on that field (or combination of fields).

  • quote:


    Jonathan's solution won't work if the minimum values of both columns are not in the same row. It will return unrelated data.

    A solution that will always work, regardless of the number of columns and stuff...

    
    
    SELECT t1.* FROM your_table t1
    WHERE CHECKSUM(*) =
    (SELECT top 1 CHECKSUM(*)
    FROM your_table t2
    WHERE t2.groupby_col = t1.groupby_col)

    Edited by - NPeeters on 10/06/2003 06:31:05 AM


    Oops, that's right. You use TOP without CHECKSUM (which is not guaranteed a unique hash), though:

    SELECT Col1, Col2, Col3

    FROM Results r

    WHERE Col2 = (SELECT TOP 1 Col2 FROM Results WHERE Col1 = r.Col1)

    AND Col3 = (SELECT TOP 1 Col3 FROM Results WHERE Col1 = r.Col1)

    Obviously, one would add ORDER BYs to the subqueries...

    --Jonathan



    --Jonathan

  • Without the subquery...

    set nocount on

    -- setting up base data

    declare @a table (a char(2), b char(2), c char(2))

    insert into @a values ('a1', 'b2', 'b3')

    insert into @a values ('a1', 'b4', 'b5')

    insert into @a values ('b1', 'c1', 'c2')

    insert into @a values ('b1', 'c3', 'c4')

    insert into @a values ('b1', 'c5', 'c6')

    select t1.a, t1.b, min(t1.c) as [c_min]

    from @a t1

    inner join ( -- keep only those with the correct a,b pair

    select a, min(b) as [b_min]

    from @a

    group by a

    ) t2 on

    t2.a = t1.a and

    t2.b_min = t1.b

    group by t1.a, t1.b

    order by t1.a, t1.b

    RandyHelpdesk: Perhaps Im not the only one that does not know what you are doing. 😉

  • TRY this:

    select *

    from blah a

    where not exists(select *

    from blah a

    where a.a=b.a AND

    ((a.b=b.b AND a.c>b.c) or (a.b>b.b))

    )

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply