removing dups for update

  • This is the temp table I have
    ID    empnum    email    fName    lname
    1    123             a@a.com    tim    slim
    2    124             b@b.com    jim    fat
    3    123                               tim    slim1
    4    123                               tim    slim
    5    124    b@b.com             jimmy    fat

    I need to do an update to my database and I am getting this table from HR
    Here are the specific qualifications
    1. a present email take precedence over one with no email. If they both have emails, I need to do an LEN on a combo of first and last name and grab the longest name

    How would I do this select? So I should end up selecting ID 1 and 5

  • The solution will involve ROW_ NUMBER() OVER (PARTITION BY empnum ORDER BY email DESC, LEN(<name field you're measuring> DESC)

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • Alan.B - Wednesday, February 28, 2018 11:36 AM

    The solution will involve ROW_ NUMBER() OVER (PARTITION BY empnum ORDER BY email DESC, LEN(<name field you're measuring> DESC)

    Can't edit from my phone...

    ... then pop that logic in a sub query, name ROW_NUMBER as RN in your subquery then filter for WHERE RN = 1

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • Hmm. not sure I follow

  • davef 22400 - Wednesday, February 28, 2018 12:44 PM

    Hmm. not sure I follow

    Sorry. Normally I would include an example but I was not at a PC. Note the sample data and solution:
    -- Easily Consumable sample data
    DECLARE @table TABLE
    (
    ID int,
    empnum int,
    email varchar(100),
    fName varchar(100),
    lName varchar(100)
    );

    INSERT @table VALUES
    (1, 123, 'a@a.com','tim','slim'),
    (2, 124, 'b@b.com','jim','fat'),
    (3, 123, '', 'tim','slim1'),
    (4, 123, '', 'tim','slim'),
    (5, 124, 'b@b.com','jimmy','fat');

    -- Solution
    SELECT *
    FROM
    (
    SELECT *, RN = ROW_NUMBER() OVER (partition by empnum ORDER BY email DESC)
    FROM @table
    ) removeDupes
    WHERE RN = 1;

    You would have to modify the ROW_NUMBER() ORDER BY Logic to accommodate the logic to handle this (which I did not fully understand):

    I need to do an LEN on a combo of first and last name and grab the longest name

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • Try this on for size:
    DECLARE @table AS TABLE (
        ID int,
        empnum int,
        email varchar(100),
        fName varchar(100),
        lName varchar(100)
    );
    INSERT @table (ID, empnum, email, fName, lName)
        VALUES    (1, 123, 'a@a.com','tim','slim'),
                (2, 124, 'b@b.com','jim','fat'),
                (3, 123, '', 'tim','slim1'),
                (4, 123, '', 'tim','slim'),
                (5, 124, 'b@b.com','jimmy','fat');

    -- Solution
    SELECT *
    FROM (
        SELECT *, RN = ROW_NUMBER() OVER (PARTITION BY email ORDER BY LEN(ISNULL(fName, '') + ISNULL(lName,'')) DESC)
        FROM @table
        WHERE email <> ''
        ) AS X
    WHERE X.RN = 1;

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • Steve. This is brilliant. One thing I forgot to mention is that I am doing a delete from the table of the dups. So this deletes all the good ones. I only need to delete the dups

    DELETE FROM @table
    SELECT *
    FROM (
      SELECT *, RN = ROW_NUMBER() OVER (PARTITION BY email ORDER BY LEN(ISNULL(NAME_FIRST, '') + ISNULL(NAME_LAST,'')) DESC)
      FROM @table
      WHERE email <> ''
      ) AS X
    WHERE X.RN = 1;

  • davef 22400 - Thursday, March 1, 2018 5:17 AM

    Steve. This is brilliant. One thing I forgot to mention is that I am doing a delete from the table of the dups. So this deletes all the good ones. I only need to delete the dups

    DELETE FROM @table
    SELECT *
    FROM (
      SELECT *, RN = ROW_NUMBER() OVER (PARTITION BY email ORDER BY LEN(ISNULL(NAME_FIRST, '') + ISNULL(NAME_LAST,'')) DESC)
      FROM @table
      WHERE email <> ''
      ) AS X
    WHERE X.RN = 1;

    That just means we change the query, as follows:DECLARE @DELETED AS TABLE (
        ID int,
        empnum int,
        email varchar(100),
        fName varchar(100),
        lName varchar(100)
    );

    DECLARE @table AS TABLE (
        ID int,
        empnum int,
        email varchar(100),
        fName varchar(100),
        lName varchar(100)
    );
    INSERT INTO @table (ID, empnum, email, fName, lName)
        VALUES (1, 123, 'a@a.com','tim','slim'),
                (2, 124, 'b@b.com','jim','fat'),
                (3, 123, '', 'tim','slim1'),
                (4, 123, '', 'tim','slim'),
                (5, 124, 'b@b.com','jimmy','fat');

    -- Solution
    DELETE
    FROM @table
    OUTPUT DELETED.*
    INTO @DELETED
    WHERE ID IN (
        SELECT ID
        FROM (
            SELECT *, RN = ROW_NUMBER() OVER (PARTITION BY email ORDER BY LEN(ISNULL(fName, '') + ISNULL(lName,'')) DESC)
            FROM @table
            WHERE email <> ''
            ) AS X
        WHERE X.RN <> 1
        );

    SELECT *
    FROM @DELETED;

    SELECT *
    FROM @table;

    NOTE: I added a table to capture what was deleted so that you can see what this is doing.  Let me know whether or not this is the correct result.

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply