Select Distinct for 4 columns but return all columns.. how?

  • dear friends,

    I have a table with too manly fields. there are duplicate records in in rows BUT ONLY for some fields.

    I need to use DISTINCT on 4 columns and then return all rows.

    or at least I should get ID value of records in result so I can real all columns with code.

    lets say:

    col1 col2 col3 col4 col5 col6

    a1 AA BB a4 CC a6

    b1 b2 b3 b4 CC b6

    c1 AA BB c4 CC c6

    result must be:

    a1 AA BB a4 CC a6

    b1 b2 b3 b4 CC b6

    here's test code:

    CREATE TABLE [dbo].[Table_2](

    [ID] [int] IDENTITY(1,1) NOT NULL,

    [col1] [nvarchar](50) NULL,

    [col2] [nvarchar](50) NULL,

    [col3] [nvarchar](50) NULL,

    [col4] [nvarchar](50) NULL,

    [col5] [nvarchar](50) NULL,

    [col6] [nvarchar](50) NULL

    ) ON [PRIMARY]

    insert into table_1 (col1,col2,col3,col4,col5,col6) values ('a1','AA','BB','a4','CC','a6')

    insert into table_1 (col1,col2,col3,col4,col5,col6) values ('b1','b2','b3','b4','b5','b6')

    insert into table_1 (col1,col2,col3,col4,col5,col6) values ('c1','AA','BB','c4','CC','c6')

  • Your test preparation code yields errors (you create one table then insert to another).

    If I understood you correctly, you need to identify duplicates by subset of columns,

    and then display all that duplicate rows.

    Grouping functions like COUNT can also be used with OVER() clause and have completely different effect.

    Run this:

    select *,

    Duplicates = COUNT(*) OVER(partition by col2, col3, col5)

    from Table_2

    Result:

    IDcol1col2col3col4col5col6Duplicates

    1a1AABBa4CCa62

    3c1AABBc4CCc62

    2b1b2b3b4b5b61

    So, the solution is:

    SELECT ID, col1,col2,col3,col4,col5,col6

    FROM

    (select *,

    Duplicates = COUNT(*) OVER(partition by col2, col3, col5)

    from Table_2

    ) t

    WHERE t.Duplicates > 1

    Result:

    IDcol1col2col3col4col5col6

    1a1AABBa4CCa6

    3c1AABBc4CCc6

    Inline view is used because OVER() by definition executes just before ORDER BY, so you can't use it in e.g. WHERE because it doesn't exist yet.

    HTH

    _____________________________________________________
    Microsoft Certified Master: SQL Server 2008
    XDetails Addin - for SQL Developers
    blog.sqlxdetails.com - Transaction log myths
  • If you need just first row of each group:

    SELECT ID, col1,col2,col3,col4,col5,col6

    FROM

    (select *,

    Rnr = ROW_NUMBER() OVER(partition by col2, col3, col5 ORDER BY ID)

    from Table_2

    ) t

    WHERE t.Rnr = 1

    Result:

    IDcol1col2col3col4col5col6

    1a1AABBa4CCa6

    2b1b2b3b4b5b6

    _____________________________________________________
    Microsoft Certified Master: SQL Server 2008
    XDetails Addin - for SQL Developers
    blog.sqlxdetails.com - Transaction log myths
  • my friend, thats exactly what I need.

    you helped me so much. thanks..

  • I'm glad to help, you are welcome.

    _____________________________________________________
    Microsoft Certified Master: SQL Server 2008
    XDetails Addin - for SQL Developers
    blog.sqlxdetails.com - Transaction log myths

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply