How to delete duplicates/triplicates etc. except for the latest record based on date?

  • Hi,

    I have a table (t_account_code) with the following three columns - AccountCode, StartDate and EndDate. The table does not have an ID column (I know - it was inherited) and at this point, modifying the table is not an option, so please bear with me. With the combination of three columns, we get a PK.

    This table has multiple entries for the same AccountCode, and I want to delete all duplicate/triplicate/multiple entries EXCEPT for the latest one, based on StartDate.

    First, I've created a temp table with all account codes (unique) and their count, if greater than 1.

    I've tried the following query, first to do a select on the accounts that occur twice:

    select * from t_account_code tac1

    where AccountCode in (select AccountCode from #Tmp_accountcodes where count_num = 2)

    and StartDate = (select MIN(StartDate) from t_account_code tac2 where tac2.AccountCode = tac1.AccountCode)

    The distinct values of count_num are 2, 3 and 4.

    This works as intended and gives me the results I want (for duplicate values), but when I change it to a delete statement, I get a syntax error in the first row for the table alias - tac1. Of course, if I delete it, then my query won't work as I'm using tac1 in the sub-query.

    Any solution to this problem will be appreciated, even if it's a different query than the one above. I want to end up with unique AccountCodes in the table, with the latest StartDate, and delete any multiple ones with the older StartDate.

    E.g.

    AccountCode ---- StartDate ---- EndDate

    1234------------ 12/31/2012 ---- 3/30/2013

    1234------------ 3/31/2013 ------4/29/2013

    1234------------ 4/30/2013 -----12/31/2014

    9876------------12/31/2012------3/30/2013

    9876------------3/31/2013 -------5/30/2014

    I want to delete the first two rows for AccountCode '1234' and delete the first row for AccountCode '9876.'

    If an AccountCode occurs only once, then I want to leave it alone.

    Thanks.

  • SQL_beginner1 (6/12/2013)


    Hi,

    I have a table (t_account_code) with the following three columns - AccountCode, StartDate and EndDate. The table does not have an ID column (I know - it was inherited) and at this point, modifying the table is not an option, so please bear with me. With the combination of three columns, we get a PK.

    This table has multiple entries for the same AccountCode, and I want to delete all duplicate/triplicate/multiple entries EXCEPT for the latest one, based on StartDate.

    First, I've created a temp table with all account codes (unique) and their count, if greater than 1.

    I've tried the following query, first to do a select on the accounts that occur twice:

    select * from t_account_code tac1

    where AccountCode in (select AccountCode from #Tmp_accountcodes where count_num = 2)

    and StartDate = (select MIN(StartDate) from t_account_code tac2 where tac2.AccountCode = tac1.AccountCode)

    The distinct values of count_num are 2, 3 and 4.

    This works as intended and gives me the results I want (for duplicate values), but when I change it to a delete statement, I get a syntax error in the first row for the table alias - tac1. Of course, if I delete it, then my query won't work as I'm using tac1 in the sub-query.

    Any solution to this problem will be appreciated, even if it's a different query than the one above. I want to end up with unique AccountCodes in the table, with the latest StartDate, and delete any multiple ones with the older StartDate.

    E.g.

    AccountCode ---- StartDate ---- EndDate

    1234------------ 12/31/2012 ---- 3/30/2013

    1234------------ 3/31/2013 ------4/29/2013

    1234------------ 4/30/2013 -----12/31/2014

    9876------------12/31/2012------3/30/2013

    9876------------3/31/2013 -------5/30/2014

    I want to delete the first two rows for AccountCode '1234' and delete the first row for AccountCode '9876.'

    If an AccountCode occurs only once, then I want to leave it alone.

    Thanks.

    Give this a try in a test environment:

    with DelDups as (

    select

    AccountCode,

    StartDate,

    EndDate,

    rn = row_number() over (partition by AccountCode order by StartDate desc)

    from

    t_account_code

    )

    delete from DelDups

    where rn > 1;

    Edit: Fixed the code.

  • Hi Lynn,

    I get the following error when I parse your script:

    Msg 156, Level 15, State 1, Line 6

    Incorrect syntax near the keyword 'over'.

    I'm using SQL Server 2008 R2.

  • SQL_beginner1 (6/13/2013)


    Hi Lynn,

    I get the following error when I parse your script:

    Msg 156, Level 15, State 1, Line 6

    Incorrect syntax near the keyword 'over'.

    I'm using SQL Server 2008 R2.

    He missed the () after ROW_NUMBER. Make sure you understand this code before you use it. 🙂

    with DelDups as (

    select

    AccountCode,

    StartDate,

    EndDate,

    rn = row_number() over (partition by AccountCode order by StartDate desc)

    from

    t_account_code

    )

    delete from DelDups

    where rn > 1;

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Sean Lange (6/13/2013)


    SQL_beginner1 (6/13/2013)


    Hi Lynn,

    I get the following error when I parse your script:

    Msg 156, Level 15, State 1, Line 6

    Incorrect syntax near the keyword 'over'.

    I'm using SQL Server 2008 R2.

    He missed the () after ROW_NUMBER. Make sure you understand this code before you use it. 🙂

    with DelDups as (

    select

    AccountCode,

    StartDate,

    EndDate,

    rn = row_number() over (partition by AccountCode order by StartDate desc)

    from

    t_account_code

    )

    delete from DelDups

    where rn > 1;

    Thanks, Sean. I can't believe that I missed that. Not enough caffeine last night I guess.

    I did go back and fix my code.

  • Thanks, Sean and Lynn! That worked.

  • Lynn Pettis (6/13/2013)


    Sean Lange (6/13/2013)


    SQL_beginner1 (6/13/2013)


    Hi Lynn,

    I get the following error when I parse your script:

    Msg 156, Level 15, State 1, Line 6

    Incorrect syntax near the keyword 'over'.

    I'm using SQL Server 2008 R2.

    He missed the () after ROW_NUMBER. Make sure you understand this code before you use it. 🙂

    with DelDups as (

    select

    AccountCode,

    StartDate,

    EndDate,

    rn = row_number() over (partition by AccountCode order by StartDate desc)

    from

    t_account_code

    )

    delete from DelDups

    where rn > 1;

    Thanks, Sean. I can't believe that I missed that. Not enough caffeine last night I guess.

    I did go back and fix my code.

    You would have caught it if you had been able to test it. 😉

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply