Simple LEFT JOIN gives surprising result - Why?

  • Programming in SQL is a humbling experience. I thought I knew just what this would return:

    [font="Courier New"]create table a (a1 int primary key, a2 int)

    create table b (b1 int primary key, a1 int)

    insert a values (1, 10)

    insert a values (2, 20)

    insert b values (2, 2)

    select *

    from a

    left join b

    on a.a1 = b.a1

    and a.a1 = 1

    [/font]

    Here's the results:

    a1a2b1a1

    110NULLNULL

    220NULLNULL

    Can someone explain to me why the second row is there? Obviously, moving the line a.a1 = 1 into a WHERE clause will eliminate the row, but I'd like a technical explanation as to why SQL null-extends the second row and includes it in the result set. I'd understand this action if the ON clause referenced a b column and a constant, but I thought that the inner table to constant conditions in an ON clause behaved identically to that condition in a WHERE clause.

    My understanding of LEFT JOIN was: return NULLS for all columns of null-extended table (b, in this case) where criteria involving that table is not met. However, here is what BOL says on outer joins:

    [font="Courier New"]

    Inner joins eliminate the rows that do not match with a row from the other table. Outer joins, however, return all rows from at least one of the tables or views mentioned in the FROM clause, as long as those rows meet any WHERE or HAVING search conditions. All rows are retrieved from the left table referenced with a left outer join, and all rows from the right table referenced in a right outer join. All rows from both tables are returned in a full outer join

    [/font]

    Wow, that's just not the way I've viewed outer joins for the last 10 years of daily programming. Does this surprise anyone else?

    Vince

  • A follow-up observation. Based on the explanation in BOL for outer joins, any criteria of inner join table column to a constant in an ON clause of an outer join is meaningless.

    That being the case, why is there no warning to that effect generated by SQL, similar to the warning generated when an outer-joined table is later inner-joined in the query?

    I would go a step further and state that such syntax (in an outer join ON clause, inner-join table column to constant) should generate a compile-time error. If the purpose of the JOIN clause is simply to put two tables together, yet not limit the number of rows put together in the process, then specifying such a statement intended to limit rather than join should be illegal.

    But here's where I really get on my soapbox. Let's slightly alter the syntax of the join, replacing the LEFT JOIN with a JOIN (and using 2 as the constant so we get something back):

    [font="Courier New"]

    create table a (a1 int primary key, a2 int)

    create table b (b1 int primary key, a1 int)

    insert a values (1, 10)

    insert a values (2, 20)

    insert b values (2, 2)

    select *

    from a

    join b

    on a.a1 = b.a1

    and a.a1 = 2

    [/font]

    Results:

    a1a2b1a1

    22022

    In this case, the ON criteria a.a1 = 2 limited results rather than linked rows from two tables, and it works as expected.

    Consider the ramifications. A statement exists in a query where, in an INNER JOIN, one of the first table's columns is joined to a constant (or T-SQL @variable) as limiting criteria in the ON clause. Testing reveals situations where rows do not exist in the second table, failing the join, so "INNER JOIN" is replaced by "LEFT JOIN" . Poof! The limiting criteria in the ON clause just became meaningless!

    My colleague and I are in shock, wondering how many of our procedures coded over the years have this subtle issue in outer joins.

    Are we just dense?

    Vince

  • I do not see any problem.

    What logically happens for a left join is:

    1. An inner join is done.

    2. Any rows in the left table not in the result are added.

    3. The Where clause is evaluated.

    etc

    SQL has many subtleties - you should always test that you are getting the results you desire.

  • I agree.

    Logically you're defining the join criteria in the ON clause. Based on that criteria, looking at the code, your left join worked perfectly. If you wish to filter the result set, that's what the WHERE clause is for. This makes sense.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • In my experience - just using the word "join" usually tells SQL server "inner join", whereas "left join" = LEFT OUTER JOIN.

    And while the results might look a little strange at first, it goes to highlight the difference between JOINING criteria, and limiting criteria. In an outer join scenario, the joining criteria aren't going to be limiting on the "side" it's defined on.

    Ultimately - something not actually defining the relationship between the two tables should never be in the ON clause.

    I do feel your pain if you have a lot of this code out there - could rear its ugly little head just about anywhere.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • It appears that the query is working exactly as described in SQL Server Books Online.

  • Yes, subtle and the reason that I (almost) never use a constant in the ON. By adding one more row to the original query it becomes clearer:

    create table a (a1 int primary key, a2 int)

    create table b (b1 int primary key, a1 int)

    insert a values (1, 10)

    insert a values (2, 20)

    insert b values (2, 2)

    insert b values (20, 1) -- The new row

    select *

    from a

    left join b

    on a.a1 = b.a1

    and a.a1 = 1

    drop table a

    drop table b

    a1 a2 b1 a1

    --- --- --- ---

    1 10 20 1

    2 20 NULL NULL

    This makes it a little easier to see that the ON is controlling which b rows join a, but not filtering the result set. (So, this is positive reinforcement to not do this.)

  • Thanks for the comments. Yes, it works as described in BOL, just not as I would have thought.

    How about some comments on the warning suggestion? Since the criteria of a column to a constant works as one would expect in an inner join, yet is worthless in an outer join, shouldn't there be a warning to that effect emitted by SQL?

    Bottom line, as suggested, constants go only in the where clause unless required to make the outer join make sense.

  • You know that I have high regard for your T-SQL understanding.

    But, sorry, but I don't think this needs a warning any more than many other sneaky behaviors of SQL Server. At least this one is syntactically defined and is not an optimizer decision.

    The two ways of expressing this query give two differently purposed result sets. Although I agree that it can be surprising, I do not agree that the constant in the ON clause is meaningless.

    -- Give me all 'a' rows.

    -- If a.a1 = 1 = b.a1

    -- also join a 'b' row to my 'a' row

    select *

    from a

    left join b

    on a.a1 = b.a1

    and a.a1 = 1

    a1 a2 b1 a1

    --- --- --- ---

    1 10 20 1

    2 20 NULL NULL

    -- Give me all 'a' rows where 'a.a1' = 1.

    -- If a.a1 = b.a1 (and thus = 1)

    -- also join a 'b' row to my 'a' row

    select *

    from a

    left join b

    on a.a1 = b.a1

    WHERE b.a1 = 1

    a1 a2 b1 a1

    --- --- --- ---

    1 10 20 1

    Then, just for fun use a FULL OUTER JOIN on the first query.

    -- Give me all 'a' rows and all 'b' rows

    -- If a.a1 = 1 = b.a1 join a 'b' row to my 'a' row

    -- Else show the 'a' and 'b' rows independently.

    select *

    from a

    left join b

    on a.a1 = b.a1

    and a.a1 = 1

    a1 a2 b1 a1

    --- --- --- ---

    1 10 20 1

    2 20 NULL NULL

    NULL NULL 2 2

    But it has been a long time since I thought about this, so thanks for bringing up the problem. It definitely should be in the basic course materials for any T-SQL querying classes and emphasized with a few good examples.

  • Sometimes putting an additional filter, even on a constant, in your join clause is the only way to go. Consider the following:

    Your boss asks for a data extract of all sales people who have sales in a certain month.

    So you whip up this quick query (all using AdventureWorks)

    Select SP.SalesPersonID, Sum(SubTotal)

    From

    Sales.SalesPerson SP

    INNER JOIN Sales.SalesOrderHeader SOH ON SP.SalesPersonID = SOH.SalesPersonID

    where SOH.OrderDate Between '1/1/2003' and '1/31/2003' OR SOH.OrderDate is null

    Group By SP.SalesPersonID

    Your boss comes back to you complaining that she doesn't see all the sales people. You think to yourself "I learned this one in SQL class. I'll just LEFT JOIN, and my problems go away."

    So you come back with

    Select SP.SalesPersonID, Sum(SubTotal)

    From

    Sales.SalesPerson SP

    LEFT JOIN Sales.SalesOrderHeader SOH ON SP.SalesPersonID = SOH.SalesPersonID

    where SOH.OrderDate Between '1/1/2003' and '1/31/2003'

    Group By SP.SalesPersonID

    What the heck? Still only 12 Sales people. I should be getting 17.

    Now you smack your hand on your forehead, and think that your where clause is filtering out NULLs, so you try

    Select SP.SalesPersonID, Sum(SubTotal)

    From

    Sales.SalesPerson SP

    LEFT JOIN Sales.SalesOrderHeader SOH ON SP.SalesPersonID = SOH.SalesPersonID

    where SOH.OrderDate Between '1/1/2003' and '1/31/2003' OR OrderDate IS NULL

    Group By SP.SalesPersonID

    Still no luck. Your filter condition on the table being joined to is causing the sales people in the source table to be filtered out altogether, so you finally come up with

    Select SP.SalesPersonID, Sum(SubTotal)

    From

    Sales.SalesPerson SP

    LEFT JOIN Sales.SalesOrderHeader SOH ON SP.SalesPersonID = SOH.SalesPersonID

    and SOH.OrderDate Between '1/1/2003' and '1/31/2003'

    Group By SP.SalesPersonID

    Ahhh...That's the ticket!

    This can work anytime you are trying to filter on a table being joined to.

    If your boss had asked only for Open Orders, or anything that required some sort of filter, but you still need to get all the records so you can at least report a 0 for those sales people that had no records, this technique can be quite useful.

    Hope this helps.

  • The behavior is exactly as it should be as has been said. And as for a warning... why would you warn when it does what it is supposed to do. I don't even like the ANSI_NULL warnings.

    See the code below.

    SELECT

    secApplicationUsers.UserID,

    ISNULL(tblUserGroupsProperty.PropertyValue,tblSecurityMembershipPropertyItems.ItemID) AS PropertyValue

    FROM tblSecurityMembershipProperty

    INNER JOIN secApplicationUsers

    ON tblSecurityMembershipProperty.ApplicationID = secApplicationUsers.ApplicationID

    LEFT JOIN tblUserGroupsProperty

    ON secApplicationUsers.UserGroupID = tblUserGroupsProperty.UserGroupID

    AND tblSecurityMembershipProperty.MembershipPropertyID= tblUserGroupsProperty.MembershipProperty

    LEFT JOIN tblSecurityMembershipPropertyItems

    ON tblSecurityMembershipPropertyItems.MembershipProperty = tblSecurityMembershipProperty.MembershipPropertyID

    AND tblUserGroupsProperty.MembershipProperty IS NULL

    AND tblSecurityMembershipPropertyItems.DefaultValue = 1

    WHERE

    tblSecurityMembershipProperty.Code=@PropertyName

    AND tblSecurityMembershipProperty.ApplicationID = @ApplicationID

    Without knowing my data I have a situation where I do fhave contants in the ON.

    ON tblSecurityMembershipPropertyItems.MembershipProperty = tblSecurityMembershipProperty.MembershipPropertyID

    AND tblUserGroupsProperty.MembershipProperty IS NULL

    AND tblSecurityMembershipPropertyItems.DefaultValue = 1

    In this case, I show the values if I have one. But if not, I go get the default's defined.

  • Jeremy,

    I appreciate your detailed example, and have faced such queries myself. However, I believe what you present does not address my original point. You've given good reasons for an ON clause comparison of the right table in a LEFT JOIN to a constant. I was addressing the left table to constant comparison.

    See post below.

  • Russell,

    You are the master. I finally understand!!!

    Your clear example and comments finally made it come together in my head. I agree now that a constant on a left table join is not meaningless - it is still specifying which rows are eligible for the join, but the inner table of an outer join will never be limited through the ON clause. The presence of that inner table to constant join (a.a1=1) does alter the result set (whether or not b's rows join or are NULL-extended), therefore it cannot be meaningless. It also can be considered consistent with INNER JOIN behavior, as in both cases it specifies joining rather than limiting criteria.

    Practically speaking, though, joining criteria = limiting criteria for an INNER JOIN and doesn't for an OUTER JOIN, so limiting criteria involving constants belong in the WHERE clause for clarity and robustness.

    I don't think it should have taken 15 years of SQL programming to learn this, but better late than never!

    Vince

  • Table a

    a1 a2

    ====

    1 10

    2 20

    Table b

    b1 a1

    =======

    2 2

    20 1

    query:

    select *

    from a

    left join b

    on a.a1 = b.a1

    and a.a1 = 1

    Result

    a1 a2 b1 a1

    =================

    110201

    220NULLNULL

    The reason why the record (220NULLNULL) appears on the resultset is because of the LEFT

    OUTER JOIN.... when a.a1 =1, b.a1 =1 and a.a1 =1, in table a, its (1,10), in table b, its (20,1)

    these two records join together natually.... when a.a1 =2 , it breaks the join conditions which

    are (a.a1 = b.a1 and a.a1 = 1), so all the rows from the preserving rows table (table a) still show

    up on the resultset, while the NULL-suppling table (table b) gives NULL value to the resultset

    which is (2,20,NULL,NULL)....

    In a sum, if you use left outer join, all the columns from the select clause, if they belong to the preserving row table, they Have to show up no matter if the join conditions are met or not!

  • Hi Jeremy:

    The query below would do the same trick

    WITH SOH AS (

    select salespersonid,subtotal

    from Sales.SalesOrderHeader

    where OrderDate Between '1/1/2003' and '1/31/2003'

    and salespersonid is not null)

    select S.SalesPersonID, sum(SOH.subtotal) as subtotal

    from Sales.SalesPerson S left join SOH

    on S.SalesPersonID = SOH.SalesPersonID

    Group By s.SalesPersonID

    order by subtotal

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply