Finding duplicate records

  • Hello all,

    I am trying to build a select statement that will 1). find records where a duplicate field exists. 2.) show the ItemNo of records where the Historical data is ' '.

    I have the query which shows me the duplicates but I'm stuck when trying to only show records where the Historical field = ' '.

    Originally I thought I could accomplish this goal with a sub query based off of the original duplicate result set but SQL server returns an error.

    Below you will find a temp table to create and try in your environment.

    Please let me know your suggestions.

    create table #Items

    (

    ItemNovarchar (50),

    SearchNo varchar (50),

    Historical int

    )

    Insert into #Items

    (

    ItemNo , SearchNo ,Historical

    )

    Values

    ('QBGFEP2050 CHD', 'QBGFEP2050 CHD', 009),

    ('QBGFEP2050 CH DIS', 'QBGFEP2050 CHD', ' '),

    ('ATMR10 MFZ','ATMR10 MFZ',002),

    ('ATMR10 MFZ N','ATMR10 MFZ', ' ')

    select * from #Items

    select [SearchNo]

    , Count(*)

    from #Items

    --where a.Blocked = 0 --and Class = ' '

    group by SearchNo

    having Count(*) > 1

    Ultimately I need a result set that returns 'ATMR10 MFZ N', and 'QBGFEP2050 CH DIS'.

    Thank you for your input.

  • Because Historical is an INT data type. When you insert ' ' to it, SQL converts it to 0. Look at the output from your first query.

    So you need to tell us, should that field be INT or a VARCHAR (as the rest of the VALUES Inserted imply).

    Having said that, I think this gets you the results you're looking for (taking advantage of the fact that the 0 value for Historical sorts to the top of the list).

    SELECT ItemNo=MAX(CASE rn WHEN 1 THEN ItemNo END)

    ,SearchNo

    ,COUNT(*)

    FROM

    (

    SELECT ItemNo, SearchNo

    ,rn=ROW_NUMBER() OVER (PARTITION BY SearchNo ORDER BY Historical)

    FROM #Items

    ) a

    GROUP BY SearchNo

    HAVING COUNT(*) > 1;

    Edit: Forgot the HAVING clause.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Thanks Dwain. Very Nicely Done! This won't be the first time you've helped me. Perhaps one day I can repay you.

    Also I checked the real table definition and the Historical field is a varchar 14.

  • kwoznica (2/18/2014)


    Thanks Dwain. Very Nicely Done! This won't be the first time you've helped me. Perhaps one day I can repay you.

    Also I checked the real table definition and the Historical field is a varchar 14.

    Glad I could help. VARCHAR made the most sense to me given your INSERT statement.

    BTW. While what I gave you will work for sure with that data type, this would also:

    SELECT ItemNo=MAX(CASE rn WHEN 1 THEN ItemNo END)

    ,SearchNo

    ,COUNT(*)

    FROM

    (

    SELECT ItemNo, SearchNo

    ,rn=ROW_NUMBER() OVER (PARTITION BY SearchNo ORDER BY NULLIF(Historical, ''))

    FROM #Items

    ) a

    GROUP BY SearchNo

    HAVING COUNT(*) > 1;

    Call the NULLIF an insurance policy (NULLs usually sort first based on a setting in the db, I think SET ANSI_NULLS ON or something like that). A blank will also sort ahead of most other stuff, but possibly not if special characters are included in the mix.

    Pay me back by buying me a few beers the next time you're in Bangkok.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply