Home Forums SQL Server 2008 T-SQL (SS2K8) Identify Duplicate Records according to Multiple Criteria RE: Identify Duplicate Records according to Multiple Criteria

  • ChrisM@Work (12/2/2013)


    Here's the simplest way. Without further information it's unlikely to be the best.

    -- I have an Employee table with Columns:

    DROP TABLE #Employee

    CREATE TABLE #Employee (EmpID INT IDENTITY(1,1), FirstName VARCHAR(100), MiddleName VARCHAR(100), LastName VARCHAR(100),

    SSN VARCHAR(25), BirthDate DATE, HireDate DATE, City VARCHAR(100), Zip VARCHAR(10))

    INSERT INTO #Employee (FirstName, MiddleName, LastName, SSN, BirthDate, HireDate, City, Zip)

    SELECT 'John', NULL, 'Smith', 'Smooth bitter', GETDATE(), GETDATE(), 'London', 'W1' UNION ALL

    SELECT 'Timothy', NULL, 'Taylor', 'Landlord', GETDATE(), GETDATE(), 'London', 'W1' UNION ALL

    SELECT 'Timothy', NULL, 'Taylor', 'Y', GETDATE()+1, GETDATE()+1, 'London', 'W1' UNION ALL

    SELECT 'Timothy', NULL, 'Taylor', 'X', GETDATE(), GETDATE(), 'P', 'Q' UNION ALL

    SELECT 'X', NULL, 'Y', 'Landlord', GETDATE(), GETDATE(), 'Z', 'A'

    -- Requirement: I have to identify the duplicate records based on below criteria:

    SELECT

    EmpID, FirstName, MiddleName, LastName, SSN, BirthDate, HireDate, City, Zip,

    Criteria1, Criteria2, Criteria3

    FROM (

    SELECT

    EmpID, FirstName, MiddleName, LastName, SSN, BirthDate, HireDate, City, Zip,

    Criteria1 = COUNT(*) OVER(PARTITION BY SSN),

    Criteria2 = COUNT(*) OVER(PARTITION BY FirstName, LastName, BirthDate),

    Criteria3 = COUNT(*) OVER(PARTITION BY FirstName, LastName, City, Zip)

    FROM #Employee

    ) d

    WHERE Criteria1+Criteria2+Criteria3 > 3

    I could certainly be wrong but this seemed to be a school assignment which is why I wanted the OP to show what has been tried. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)