Remove/Keep record based on group

  • Hi,

    I have the following data:

    declare @Policy table

    (

    Download varchar(20),

    AccountNum varchar(20),

    RootPolicyNum varchar(20),

    PolicyNum varchar(20),

    CurrentStatus varchar(20)

    )

    insert into @Policy (Download, AccountNum, RootPolicyNum, PolicyNum, CurrentStatus)

    select 'EQAutralia', '57126', '1006784', '1006784', 'Renewed'

    UNION ALL

    select 'EQAutralia', '57126', '1006784', '1000194', 'Approved'

    UNION ALL

    select 'EQAutralia', '57126', '1006784', '00XH09000', 'Approved'

    UNION ALL

    select 'EQAutralia', '57126', 'AU120736', 'AU120736', 'Approved'

    UNION ALL

    select 'EQCalifornia', '57126', '1006784', '00XG96301', 'Renewed'

    select * from @Policy

    For every group of Download, AccountNum, and RootPolicyNum, I only want to keep the 'Renewed' CurrentStatus records if they are the only row in a group. So in the example data above, I want to remove the first row (EQAustralia, 57126, 1006784, 1006784, Renewed) since it is part of a group that containins 'Approved' records, but I want to keep the last row (EQCalifornia, 57126, 1006784, 00XG96301, Renewed) because it is the only record for that group and it is CurrentStatus is 'Renewed'

    thanks

    Scott

  • I came up with the following:

    if object_id('tempdb..#Policy') is not null

    set noexec on;

    create table #Policy

    (

    Download varchar(20),

    AccountNum varchar(20),

    RootPolicyNum varchar(20),

    PolicyNum varchar(20),

    CurrentStatus varchar(20)

    );

    set noexec off;

    truncate table #Policy;

    insert into #Policy (Download, AccountNum, RootPolicyNum, PolicyNum, CurrentStatus)

    select 'EQAutralia', '57126', '1006784', '1006784', 'Renewed'

    UNION ALL

    select 'EQAutralia', '57126', '1006784', '1000194', 'Approved'

    UNION ALL

    select 'EQAutralia', '57126', '1006784', '00XH09000', 'Approved'

    UNION ALL

    select 'EQAutralia', '57126', 'AU120736', 'AU120736', 'Approved'

    UNION ALL

    select 'EQCalifornia', '57126', '1006784', '00XG96301', 'Renewed';

    select * from #Policy;

    with Renewals as (

    select

    Download,

    AccountNum,

    RootPolicyNum,

    PolicyNum,

    CurrentStatus,

    cnt = count(*) over (partition by Download, AccountNum, RootPolicyNum)

    from

    #Policy

    )

    --select * from Renewals where cnt > 1 and CurrentStatus = 'Renewed';

    delete from Renewals where cnt > 1 and CurrentStatus = 'Renewed';

    select * from #Policy;

    There is a problem with this code, however. Can there be a chance where you may have two or more records with a CurrentStatus of 'Renewed' for a given RootPolicyNum.

  • Use the GROUP BY and HAVING operators. Try something like this:

    DELETE P

    FROM @Policy P

    JOIN (

    SELECT Download, AccountNum, RootPolicyNum

    FROM @Policy

    GROUP BY Download, AccountNum, RootPolicyNum

    HAVING COUNT(*) > 1

    ) AS X

    ON X.Download = P.Download

    AND X.AccountNum = P.AccountNum

    AND X.RootPolicyNum = P.RootPolicyNum

    WHERE P.CurrentStatus = 'Renewed'

  • I started replying before Lynn's answer posted. Lynn's answer give's you a better estimated execution plan.

  • works, thanks!

  • scottcabral (3/28/2016)


    works, thanks!

    It works as long as there aren't two rows of data for the same Download, AccountNum, RootPolicyNum where CurrentStatus = 'Renewed' for both.

  • Lynn Pettis (3/28/2016)


    scottcabral (3/28/2016)


    works, thanks!

    It works as long as there aren't two rows of data for the same Download, AccountNum, RootPolicyNum where CurrentStatus = 'Renewed' for both.

    I think that can be fixed by replacing the COUNT(*) with a DENSE_RANK().

    WITH Renewals as (

    select

    Download,

    AccountNum,

    RootPolicyNum,

    PolicyNum,

    CurrentStatus,

    dr = DENSE_RANK() OVER (partition by Download, AccountNum, RootPolicyNum ORDER BY CurrentStatus)

    from

    #Policy

    )

    select * from Renewals where dr > 1 and CurrentStatus = 'Renewed';

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • drew.allen (3/28/2016)


    Lynn Pettis (3/28/2016)


    scottcabral (3/28/2016)


    works, thanks!

    It works as long as there aren't two rows of data for the same Download, AccountNum, RootPolicyNum where CurrentStatus = 'Renewed' for both.

    I think that can be fixed by replacing the COUNT(*) with a DENSE_RANK().

    WITH Renewals as (

    select

    Download,

    AccountNum,

    RootPolicyNum,

    PolicyNum,

    CurrentStatus,

    dr = DENSE_RANK() OVER (partition by Download, AccountNum, RootPolicyNum ORDER BY CurrentStatus)

    from

    #Policy

    )

    select * from Renewals where dr > 1 and CurrentStatus = 'Renewed';

    Drew

    What if you wanted to delete one of the 'Renewed' rows if there were two? Don't think this will work.

  • I think the Dense_Rank will work.

    we should never have a scenerio where we have 2 renewals and have to remove just one.

  • This could work too !

    Delete P

    from @Policy P

    left join

    (

    select Download, AccountNum, RootPolicyNum

    from @Policy

    Group by Download, AccountNum,RootPolicyNum

    Having count(distinct CurrentStatus)=1

    and max(CurrentStatus)='Renewed'

    ) Q

    on P.Download=Q.Download

    and P.AccountNum=Q.AccountNum

    and P.RootPolicyNum=Q.RootPolicyNum

    where Q.AccountNum is null

    select *

    from @Policy

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply