Get last row based on duplicate column

  • I have a table that contains:

    userid email name

    1 a@a.com Bill

    2 a@a.com Billy

    3 a@a.com William

    4 b@b.com John

    5 b@b.com Johnny

    6 c@c.com Jimmy

    7 c@c.com Jim

    I am trying to get the output so that it prints only the last line with the matching email:

    userid email name

    3 a@a.com William

    5 b@b.com Johnny

    7 c@c.com Jim

    I have this:

    SELECT DISTINCT *

    FROM USERS AS A

    INNER JOIN (

    SELECT USERID, EMAIL, NAME

    FROM USERS

    GROUP BY USERID, EMAIL NAME

    HAVING COUNT(*) > 1) AS B

    ON A.EMAIL = B.EMAIL

    But I get nothing back.

    What am I missing?

    Thanks for all help.

  • If you setup test data this becomes easier, check out the first link in my sig for what we'd prefer when we do code assistance.

    In general, what you're dealing with is getting the last detail from a logging table. This is usually done by grouping on the identifiers (in this case, email), then using MAX() on a 'last row determinator', in this case userID... then joining the result of that back to the original.

    I don't have code but basically, it'll look like this:

    SELECT

    t.UserID,

    t.Email,

    t.Name

    FROM

    (SELECT

    Email,

    Max(UserID) AS MaxID

    FROM

    Table

    GROUP BY

    Email

    ) AS drv

    JOIN

    Table AS t

    ONdrv.Email = t.Email

    AND drv.MaxID = t.UserID


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • That did the trick.

    Thank you very much.

  • This might do it for you:

    CREATE TABLE #T(userid INT, emailname VARCHAR(50))

    INSERT INTO #T

    SELECT 1, 'a@a.com Bill' UNION ALL

    SELECT 2, 'a@a.com Billy' UNION ALL

    SELECT 3, 'a@a.com William' UNION ALL

    SELECT 4, 'b@b.com John' UNION ALL

    SELECT 5, 'b@b.com Johnny' UNION ALL

    SELECT 6, 'c@c.com Jimmy' UNION ALL

    SELECT 7, 'c@c.com Jim'

    ;WITH cte

    as (select row_number() over(partition by substring(emailname,1,7) order by userid DESC) as rn,emailname

    from #T)

    select * from cte where rn = 1

    Results:

    rnemailname

    1a@a.com William

    1b@b.com Johnny

    1c@c.com Jim

    If everything seems to be going well, you have obviously overlooked something.

    Ron

    Please help us, help you -before posting a question please read[/url]
    Before posting a performance problem please read[/url]

  • just go with practical scenario it may happen that email id same but domain different

    in that case Just remove substring from "bitbucket-25253" solution

    declare @T TABLE

    (userid INT, emailname VARCHAR(50),name varchar(10))

    INSERT INTO @T

    SELECT 1, 'test123test@b.com','Bill' UNION ALL

    SELECT 1, 'test123test@a.com','Billiiiii' UNION ALL

    SELECT 2, 'a@a.com','Billy' UNION ALL

    SELECT 3, 'a@a.com','William' UNION ALL

    SELECT 4, 'b@b.com','John' UNION ALL

    SELECT 5, 'b@b.com','Johnny' UNION ALL

    SELECT 6, 'c@c.com','Jimmy' UNION ALL

    SELECT 7, 'c@c.com','Jim'

    ;WITH cte

    as (select row_number() over(partition by emailname order by userid DESC) as rn,emailname ,name

    from @T)

    select * from cte where rn = 1

  • This has already been answered but it's interesting that the logical requirement exactly matches an existing function which isn't used in any of the solutions offered. What you're looking for - the logical requirement - is the maximum userid per partition of email, and the function is;

    MAX(userid) OVER(PARTITION BY email)

    Here's the query;

    ;WITH SampleData (userid, email, name) AS (

    SELECT 1, 'a@a.com', 'Bill' UNION ALL

    SELECT 2, 'a@a.com', 'Billy' UNION ALL

    SELECT 3, 'a@a.com', 'William' UNION ALL

    SELECT 4, 'b@b.com', 'John' UNION ALL

    SELECT 5, 'b@b.com', 'Johnny' UNION ALL

    SELECT 6, 'c@c.com', 'Jimmy' UNION ALL

    SELECT 7, 'c@c.com', 'Jim'

    ),

    MyQuery AS (

    SELECT userid, email, name,

    ChosenUserID = MAX(userid) OVER(PARTITION BY email)

    FROM SampleData

    )

    SELECT userid, email, name

    --, ChosenUserID

    FROM MyQuery

    WHERE userid = ChosenUserID

    ORDER BY userid

    It's unlikely that it will perform any different to the other solutions, but you can deduce - at a swift glance - the intent of the code.


    [font="Arial"]Low-hanging fruit picker and defender of the moggies[/font]

    For better assistance in answering your questions, please read this[/url].


    Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

    Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

  • ChrisM@home (10/28/2012)


    This has already been answered but it's interesting that the logical requirement exactly matches an existing function which isn't used in any of the solutions offered. What you're looking for - the logical requirement - is the maximum userid per partition of email, and the function is;

    I always seem to forget about doing that, as I've found it to be relatively non-performant in 2k5 where I do most of my work... and I'm a creature of habit, so I use the methods I've been using since 7.0 until someone can show me the other way is faster. 🙂


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply