Left outer join brings duplicate records

  • Hi There,
    I have a query below
    SELECT
    *
    From
    (select * from CricketStaging_pass_SP_ID where EmaiNameDuplicated = 'T' and EmaiNameDuplicatedParent = 1 and ISNULL(EmailNameIncomplete,'') = '') as DataLeft
    left outer join (select * from CricketStaging_pass_SP_ID where EmaiNameDuplicated = 'T' and EmaiNameDuplicatedParent = 0 and ISNULL(EmailNameIncomplete,'') = '') as DataRight
    on DataLeft.Email = DataRight.Email
    and DataLeft.FirstName = DataRight.FirstName
    and DataLeft.LastName = DataRight.LastName

    When you query first Dataleft part "select * from CricketStaging_pass_SP_ID where EmaiNameDuplicated = 'T' and EmaiNameDuplicatedParent = 1 and ISNULL(EmailNameIncomplete,'') = ''" you get below results

    ID=63600
    Email:parth.patel@microchannel.com.au

    When you query first Dataleft part "select * from CricketStaging_pass_SP_ID where EmaiNameDuplicated = 'T' and EmaiNameDuplicatedParent = 0 and ISNULL(EmailNameIncomplete,'') = ''" you get below results

    First Row:
    ID=2249051
    Email:parth.patel@microchannel.com.au

    Second Row:
    ID=2998983
    Email:parth.patel@microchannel.com.au

    When the first query is fired i.e. below:
    SELECT
    *
    From
    (select * from CricketStaging_pass_SP_ID where EmaiNameDuplicated = 'T' and EmaiNameDuplicatedParent = 1 and ISNULL(EmailNameIncomplete,'') = '') as DataLeft
    left outer join (select * from CricketStaging_pass_SP_ID where EmaiNameDuplicated = 'T' and EmaiNameDuplicatedParent = 0 and ISNULL(EmailNameIncomplete,'') = '') as DataRight
    on DataLeft.Email = DataRight.Email
    and DataLeft.FirstName = DataRight.FirstName
    and DataLeft.LastName = DataRight.LastName

    I get duplicate data.

    First Row:
    ID=63600
    Email:parth.patel@microchannel.com.au

    Second Row:
    ID=63600
    Email:parth.patel@microchannel.com.au

    The columns in the table are text - can you please advise how should this be fixed?

  • parth.patel 9337 - Thursday, March 1, 2018 4:55 PM

    The columns in the table are text - can you please advise how should this be fixed?

    There is not enough information here to give a 'correct' anwser.
    Duplicate data can be 'solved' by using  SELECT DISTINCT ......

    If you are not expecting duplicate rows, first analyse why you are getting the duplicate rows. Very often when you get duplicate data this is an indication that the query is not correct. Analyse first then solve the problem. Using SELECT DISTINCT, will give you distinct rows, but is a bit of a 'last resort' method.

    Ben

  • You are only matching on email, firstname and lastname. Your ID and the EmaiNameDuplicatedParent is what differs, and the same email is in both left and right. So you're not getting duplicate data, you're getting what you've asked for.

    Why aren't you just looking for everything where EmaiNameDuplicatedParent = 1? what is the purpose of creating two derived tables and joining them just to exclude things with a zero value in that column?

    -------------------------------------------------------------------------------------------------------------------------------------
    Please follow Best Practices For Posting On Forums to receive quicker and higher quality responses

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply