finding duplicates users

  • Hi,

    I'm trying to get the 'leaver's date' of employees who have left our organisation. However, I came across an issue in that if a previous employee is re-employed they get a new record - the old one isn't reclaimed for some reason. So when I am querying the table using 'where leaver = 1' there is a chance that there is also another record for this person if they have been re-employed and current.

    Being quite new to T-SQL I am finding this challenging. I have done some research and come up with a solution but I feel its clunky. Here is what I have done so far:
    select
     e1.EmployeeId,e1.Forename,e1.Surname,e1.DateOfBirth,E1.leaver,E1.EmployeeStartDate,E1.LeftDate
    FROM Employee e1
    INNER JOIN Employee e2 ON e2.DateOfBirth = e1.DateOfBirth
    WHERE E1.Surname = E2.Surname AND e1.Forename = E2.Forename AND E1.leaver > E2.leaver
    This gives me a result set of people who left and have returned, and currently employed. I thought I could then grab the EmployeeIds and omit them from my query.
    If I now use SELECT * FROM employee WHERE EmployeeId NOT IN (64, 746, 50, 500,747, 603,704,528,398,562)
    and leaver ='1'
    order by surname
    I get a result set of people who have left and no longer employed.
    I'm struggling on how to combine these two processes into one T-SQL query.

    Thanks in advance,


  • SELECT
        EmployeeId, Forename, Surname, DateOfBirth, leaver, EmployeeStartDate, LeftDate
    FROM (
        SELECT *, ROW_NUMBER() OVER(PARTITION BY EmployeeId
            ORDER BY CASE WHEN Leaver = 1 THEN LeftDate ELSE EmployeeStartDate END DESC) AS row_num
        FROM dbo.Employee
    ) AS query1
    WHERE row_num = 1 AND
        Leaver = 1

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • ScottPletcher - Wednesday, October 3, 2018 3:15 PM

    Thanks for the response. using your code, In the result set I still see some records whereby the user is both current and a leaver. For example,
    user Joe bloggs has three records in the employee table
    (previous employments)
    id = 603, fn=Joe,sn=bloggs, dob = 01/01/70, leaver =1, LeftDate= 2015
    id =704, fn=Joe,sn=bloggs,dob = 01/01/70,leaver =1,LeftDate= 2017

    (current employment)
    if i run a query using 'where leaver =0' he appears once
    id=852,  fn=Joe,sn=bloggs,dob = 01/01/70, leaver =0, LeftDate = null

    My aim is to get the 'left date' of people who are no longer employed at our company.  Although Joe has left twice before I don't want him to show in the leavers result set as he is now a current user.

    Thanks

  • Try this on for size:
    WITH LEAVERS AS (

        SELECT    *,
            ROW_NUMBER() OVER(PARTITION BY EmployeeId ORDER BY LeftDate DESC) AS row_num
        FROM dbo.Employee AS E
        WHERE E.Leaver = 1
    )
    SELECT
        L.EmployeeId,
        L.Forename,
        L.Surname,
        L.DateOfBirth,
        L.leaver,
        L.EmployeeStartDate,
        L.LeftDate
    FROM LEAVERS AS L
    WHERE L.row_num = 1;

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • Try this: 

    SELECT a.EmployeeId, a.Forename, a.Surname, a.DateOfBirth, a.leaver, a.EmployeeStartDate, a.LeftDate,
            row_number() over (partition by a.forname, a.surname, a.dob order by a.empid) as rn, b.count_reemployed
    FROM (
    SELECT *, ROW_NUMBER() OVER(PARTITION BY E.EmployeeId, E.surname, E.forename
    ORDER BY CASE WHEN E.Leaver = 1 THEN E.LeftDate ELSE E.EmployeeStartDate END DESC) AS row_num
    FROM dbo.Employee E
    ) AS a

    LEFT JOIN
    (
    SELECT forename, surname, DateOfBirth, count(forename+surname+convert(varchar(11), DateOfBirth)) count_reemployed from dbo.Employee
    -- the joint of forename, surname and dob may serve you as a key to identify a reemployeed
    group by forename, surname, dob
    ) AS b
    ON b.dob=a.dob

    WHERE row_num= 1 AND b.count_reemployed =1 AND LeftDate IS NOT NULL

  • osxman - Wednesday, October 3, 2018 4:53 PM

    ScottPletcher - Wednesday, October 3, 2018 3:15 PM

    Thanks for the response. using your code, In the result set I still see some records whereby the user is both current and a leaver. For example,
    user Joe bloggs has three records in the employee table
    (previous employments)
    id = 603, fn=Joe,sn=bloggs, dob = 01/01/70, leaver =1, LeftDate= 2015
    id =704, fn=Joe,sn=bloggs,dob = 01/01/70,leaver =1,LeftDate= 2017

    (current employment)
    if i run a query using 'where leaver =0' he appears once
    id=852,  fn=Joe,sn=bloggs,dob = 01/01/70, leaver =0, LeftDate = null

    My aim is to get the 'left date' of people who are no longer employed at our company.  Although Joe has left twice before I don't want him to show in the leavers result set as he is now a current user.

    Thanks

    So is the EmployeeStartDate not properly populated with a later start date?

    Without usable sample data, it's impossible to test for those types of things.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • Hi osxman, I've updated my reply. It should work.

  • keneangbu - Thursday, October 4, 2018 9:29 AM

    Hi Rookie, I've updated my reply. It should work.

    Hi,

    Works a treat - the count reemployed is a nice touch!
    Thanks for your assistance.

    Cheers

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply