maximum rows without duplication

  • Consider this table:

    declare @a table (a int,b int,c char(1))

    insert into @a

    select 1,17,'a' union all

    select 1,15,'b' union all

    select 2,15,'c' union all

    select 3,14,'d' union all

    select 4,13,'e' union all

    select 3,13,'f' union all

    select 5,12,'g' union all

    select 6,12,'h' union all

    select 6,11,'i' union all

    select 7,10,'j' union all

    select 8,9,'k' union all

    select 8,10,'l'

    My goal is to get the maximum rows in my result without any duplicate values within columns a or b across all rows. In this case I should get 8 rows. I tried these two queries below but I only get 6 or 7 rows respectively due to an eliminated row causing a qualifying row to be eliminated because they have the same value in a particular column. Thanks!!

    ------------------------------------------------------------------------------------

    SELECT c

    FROM

    (SELECT ROW_NUMBER() OVER (PARTITION BY a ORDER BY c asc) Rank_a

    ,ROW_NUMBER() OVER (PARTITION BY b ORDER BY c asc) Rank_b

    ,c

    FROM @a) t

    WHERE rank_a = 1

    AND rank_b = 1

    ------------------------------------------------------------------------------------

    SELECT c

    FROM

    (SELECT ROW_NUMBER() OVER (PARTITION BY b ORDER BY c asc) Rank_b,c

    FROM

    (SELECT ROW_NUMBER() OVER (PARTITION BY a ORDER BY c asc) Rank_a,b,c

    FROM @a) t

    WHERE rank_a = 1) u

    WHERE rank_b = 1

    ------------------------------------------------------------------------------------

  • Here's one way to do it:

    declare @a table (a int,b int,c char(1))

    insert into @a

    select 1,17,'a' union all

    select 1,15,'b' union all

    select 2,15,'c' union all

    select 3,14,'d' union all

    select 4,13,'e' union all

    select 3,13,'f' union all

    select 5,12,'g' union all

    select 6,12,'h' union all

    select 6,11,'i' union all

    select 7,10,'j' union all

    select 8,9,'k' union all

    select 8,10,'l'

    SELECT distinct a1.a, a1.b from @a a1 where a1.b in

    (select top 1 a2.b from @a a2 where a1.a = a2.a)

  • OH! I re-read and if you're only looking to get 'c' column values here's an approach that can do it using a CTE:

    declare @a table (a int,b int,c char(1))

    insert into @a

    select 1,17,'a' union all

    select 1,15,'b' union all

    select 2,15,'c' union all

    select 3,14,'d' union all

    select 4,13,'e' union all

    select 3,13,'f' union all

    select 5,12,'g' union all

    select 6,12,'h' union all

    select 6,11,'i' union all

    select 7,10,'j' union all

    select 8,9,'k' union all

    select 8,10,'l';

    WITH TOP_CTE

    (

    a, b, c

    )

    AS

    (

    SELECT distinct a1.a, a1.b, a1.c from @a a1 where a1.b in

    (select top 1 a2.b from @a a2 where a1.a = a2.a)

    )

    SELECT c from TOP_CTE;

  • your query returns rows g and h which both have 12 in column b.

    Thanks for the speedy reply though!

  • Are you looking for uniqueness for:

    1. a AND b (a,b) = 8 rows, which includes the one you said it doesn't

    2. a OR b = 1 row (7,10,'j')

    3. a UNION b = 2, 4, 5, 7, 17, 14, 11, 9

  • Sorry for not being clear.

    None of the above. It is not and/or/union.

    In my resultset I need all rows to have unique values in column a. There cannot be two rows with the same value in column a, regardless of what is in column b. At the same time I need all rows to have unique values in column b. There cannot be two rows with the same value in column b, regardless of what is in column a. If a row has a=b, that is ok.

    Thanks!

  • Maybe code below will do what you need.

    I haven't tried for efficiency, just to get the result you want.

    select

    a.a, a.b, a.c

    from @a a

    left outer join (

    select b

    from @a a2

    group by b

    having count(*) > 1

    ) AS b_multi on

    a.b = b_multi.b

    where

    ((b_multi.b is null)

    or (not exists(select 1 from @a a3 where a3.a = a.a and a3.b <> a.b)))

    order by

    1

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • Your code works for this particuar set of rows. If I add two more rows to my table, duplicates appear:

    select 18,19,'m' union all <--- new row

    select 18,20,'n' union all <--- new row

    select 1,17,'a' union all

    select 1,15,'b' union all

    select 2,15,'c' union all

    select 3,14,'d' union all

    select 4,13,'e' union all

    select 3,13,'f' union all

    select 5,12,'g' union all

    select 6,12,'h' union all

    select 6,11,'i' union all

    select 7,10,'j' union all

    select 8,9,'k' union all

    select 8,10,'l';

    I am working with hundreds of thousands ot rows. The requirement is to get the maximum amount of rows in all instances without duplicating values within a column.

    Thanks!!

  • select

    a.a, a.b, a.c

    from @a a

    left outer join (

    select b

    from @a a2

    group by b

    having count(*) > 1

    ) AS b_multi on

    a.b = b_multi.b

    where

    ((b_multi.b is null)

    or (not exists(select 1 from @a a3 where a3.a = a.a and a3.b <> a.b))) and

    not exists(select 1 from @a a4 where a4.a = a.a and a4.b <> a.b)

    order by

    1

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply