Trying to explain LEFT OUTER JOIN

  • I have the following code:

    SELECT b.Fault,

    ISNULL(SUM(CASE WHEN a.status_time IS NOT NULL THEN a.status_time

    WHEN a.status_time IS NULL THEN DATEDIFF(ss,status_change,getdate())end),0)AS sec,

    b.status

    FROM AP4_packing_leg1_HSM_status a

    INNER JOIN

    AP4_packing_leg1_HSM_Faults b

    ON a.status = b.status and a.batch = 105

    group by a.status, b.Fault, b.status

    ORDER BY b.status

    The results from this exclude any values in table A if there is no match in B.status. I want the SUM of A.status_time along with B.status (NULL) where this is the case.

    If I use a LEFT OUTER JOIN as shown in Venn diagrams I get too many results.

    The working tables have only three rows where a. batch = 105, the LEFT OUTER JOIN query returns 7 rows?

    Could anyone explain what is going on here please?

  • With a LEFT JOIN, you take all the rows from the left table.

    If AP4_packing_leg1_HSM_status has 7 rows, you will get 7 rows.

    Putting a.batch = 105 in the ON clause doesn't filter on the rows of table a. It is a join criteria, meaning that if a.batch <> 105, you just get NULL values for the columns from b. But the rows from a are still returned.

    If you want to actually filter on that criteria, you need to include a.batch = 105 in a WHERE clause.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • -- Simplify the query and examine the unaggregated result of the INNER JOIN

    SELECT

    a.[status], a.batch,

    b.[status], b.batch

    FROM AP4_packing_leg1_HSM_status a

    INNER JOIN AP4_packing_leg1_HSM_Faults b

    ON a.[status] = b.[status]

    AND a.batch = 105

    --group by a.status, b.Fault, b.status

    ORDER BY b.[status]

    -- and the LEFT JOIN

    --Perhaps you should be joining on batch as well as [status]?

    SELECT

    a.[status], a.batch,

    b.[status], b.batch

    FROM AP4_packing_leg1_HSM_status a

    INNER JOIN AP4_packing_leg1_HSM_Faults b

    ON a.[status] = b.[status]

    AND a.batch = b.batch

    WHERE a.batch = 105

    β€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Thanks for the help!

    It is true (of course) that the WHERE clause sorts out the LEFT OUTER JOIN (thanks). The only niggle is I remember moving the a.batch = 105 from a WHERE to the JOIN for what was a valid reason at the time but cannot remember the details now. Perhaps I should leave both in???

    Thanks so much πŸ™‚

  • brett.y (9/4/2014)


    Thanks for the help!

    It is true (of course) that the WHERE clause sorts out the LEFT OUTER JOIN (thanks). The only niggle is I remember moving the a.batch = 105 from a WHERE to the JOIN for what was a valid reason at the time but cannot remember the details now. Perhaps I should leave both in???

    Thanks so much πŸ™‚

    I usually put filters in the ON clause when I filter on columns from the right table and there might be issues with NULL values (because non-matching rows will get all NULL values for columns of the right tables).

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Take all the rows from the left table.

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply