Table Joining

  • Hi guys,

    Just a quick question.

    Below I'm joining onto a table with a WHERE clause in two different ways and I'm wondering what is generally considered a "better" option. When I say better option, I mean an overall better option.

    First Way (Normal Way)


    SELECT A.FirstName, A.LastName, B.Location
    FROM TableA A
    LEFT OUTER JOIN TableB B
    ON A.PersonId = B.PersonId

    WHERE B.Location = 'London'

    Second Way (Joining onto a derived table)

    SELECT A.FirstName, A.LastName, GetLocation.Location
    FROM TableA
    LEFT OUTER JOIN (SELECT PersonId, Location FROM TableB WHERE Location = 'London')GetLocation
    ON A.PersonId = GetLocation.PersonId

    I as a beginner, somehow, started off by using the second way because I believed that joining onto a derived table "targeted" more specifically the columns I was interested in and that it "saved" scanning other columns. In other words I believed that being more explicit was better. Later on, I started using the first way. The first way seems to give me more flexibility if later on in my query I need the values of other columns. 

    Now I'm just confused. So between the first way and the second way, are there really any actual differences and if there are, what are they. Are they important differences? Does it matter which way I choose? What should I generally have in mind when performing joins?

    I'd appreciate some general feedback or perhaps some rules of thumb to keep in mind when performing joins. I've bought Itzik Ben Gans book on T-SQL Fundamentals Third Edition, read through the joins part and have generally understood the concept but things like the examples above are obviously things that probably need to be asked.

    Any feedback appreciated.

    Thanks guys.

  • NikosV - Thursday, September 20, 2018 12:56 PM

    Hi guys,

    Just a quick question.

    Below I'm joining onto a table with a WHERE clause in two different ways and I'm wondering what is generally considered a "better" option. When I say better option, I mean an overall better option.

    First Way (Normal Way)


    SELECT A.FirstName, A.LastName, B.Location
    FROM TableA A
    LEFT OUTER JOIN TableB B
    ON A.PersonId = B.PersonId

    WHERE B.Location = 'London'

    Second Way (Joining onto a derived table)

    SELECT A.FirstName, A.LastName, GetLocation.Location
    FROM TableA
    LEFT OUTER JOIN (SELECT PersonId, Location FROM TableB WHERE Location = 'London')GetLocation
    ON A.PersonId = GetLocation.PersonId

    I as a beginner, somehow, started off by using the second way because I believed that joining onto a derived table "targeted" more specifically the columns I was interested in and that it "saved" scanning other columns. In other words I believed that being more explicit was better. Later on, I started using the first way. The first way seems to give me more flexibility if later on in my query I need the values of other columns. 

    Now I'm just confused. So between the first way and the second way, are there really any actual differences and if there are, what are they. Are they important differences? Does it matter which way I choose? What should I generally have in mind when performing joins?

    I'd appreciate some general feedback or perhaps some rules of thumb to keep in mind when performing joins. I've bought Itzik Ben Gans book on T-SQL Fundamentals Third Edition, read through the joins part and have generally understood the concept but things like the examples above are obviously things that probably need to be asked.

    Any feedback appreciated.

    Thanks guys.

    First, run the queries and compare the output, you should notice that they are different.

    The first query is actually an INNER JOIN as you are filtering on the B.Location column being equal to 'London'.  This will only be true where there is data from TableB.

    The second query will return a rows of data from TableA plus any data from TableB where B.Location is equal to 'London'.

    You could also write the second query like this:

    SELECT
    A.FirstName
    , A.LastName
    , B.Location
    FROM
    TableA A
    LEFT OUTER JOIN TableB B
      ON A.PersonId = B.PersonId and
       B.Location = 'London';

  • NikosV - Thursday, September 20, 2018 12:56 PM

    Hi guys,

    Just a quick question.

    Below I'm joining onto a table with a WHERE clause in two different ways and I'm wondering what is generally considered a "better" option. When I say better option, I mean an overall better option.

    First Way (Normal Way)


    SELECT A.FirstName, A.LastName, B.Location
    FROM TableA A
    LEFT OUTER JOIN TableB B
    ON A.PersonId = B.PersonId

    WHERE B.Location = 'London'

    Second Way (Joining onto a derived table)

    SELECT A.FirstName, A.LastName, GetLocation.Location
    FROM TableA
    LEFT OUTER JOIN (SELECT PersonId, Location FROM TableB WHERE Location = 'London')GetLocation
    ON A.PersonId = GetLocation.PersonId

    I as a beginner, somehow, started off by using the second way because I believed that joining onto a derived table "targeted" more specifically the columns I was interested in and that it "saved" scanning other columns. In other words I believed that being more explicit was better. Later on, I started using the first way. The first way seems to give me more flexibility if later on in my query I need the values of other columns. 

    Now I'm just confused. So between the first way and the second way, are there really any actual differences and if there are, what are they. Are they important differences? Does it matter which way I choose? What should I generally have in mind when performing joins?

    I'd appreciate some general feedback or perhaps some rules of thumb to keep in mind when performing joins. I've bought Itzik Ben Gans book on T-SQL Fundamentals Third Edition, read through the joins part and have generally understood the concept but things like the examples above are obviously things that probably need to be asked.

    Any feedback appreciated.

    Thanks guys.

    The two are different and will give different results, so which one you use depends on which results you want.  The first one, the WHERE clause converts your outer join to an inner join, so you should use the inner join instead of this particular one if it produces the results you want.

    The second one also has an alternate form that is much simpler for relatively easy criteria.  You will probably still want to use your original form if the WHERE clause gets too complex, and you should only use this form with outer joins.  An inner join, it is much better to clearly separate the join conditions from the filter conditions.

    Here is sample data.

    CREATE TABLE #Person
    (
        PersonID  TINYINT PRIMARY KEY,
        FirstName VARCHAR(50),
        LastName VARCHAR(50)
    )

    INSERT #Person(PersonID, FirstName, LastName)
    VALUES (1, 'Paul', 'McCartney'), (2, 'John', 'Lennon'), (3, 'George', 'Harrison'), (4, 'Ringo', 'Starr')

    CREATE TABLE #PersonLocations(
    (
        PersonID TINYINT FOREIGN KEY REFERENCES #Person(PersonID),
        PersonLocation VARCHAR(50)
    )

    INSERT #PersonLocations
    VALUES(1, 'London'), (1, 'Liverpool'), (3, 'Liverpool'), (4, 'London')

    SELECT *
    FROM #Person p
    LEFT OUTER JOIN #PersonLocations pl
        ON p.PersonID = pl.PersonID
    WHERE pl.PersonLocation = 'London'

    SELECT *
    FROM #Person p
    INNER JOIN #PersonLocations pl
        ON p.PersonID = pl.PersonID
    WHERE pl.PersonLocation = 'London'

    SELECT *
    FROM #Person p
    LEFT OUTER JOIN ( SELECT pl.PersonID, pl.PersonLocation FROM #PersonLocations pl WHERE pl.PersonLocation = 'London' ) l
        ON p.PersonID = l.PersonID

    SELECT *
    FROM #Person p
    LEFT OUTER JOIN #PersonLocations pl
        ON p.PersonID = pl.PersonID
            AND pl.PersonLocation = 'London'

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • I'd go with Lynn's suggestion of putting it in the ON clause of the join conditions, it's the cleanest syntax whilst avoiding accidentally converting an OUTER join into an INNER one as in the first variant posted.

  • As to the query plans for these:

    SELECT A.FirstName, A.LastName, B.Location
    FROM TableA A
    LEFT OUTER JOIN TableB B
    ON A.PersonId = B.PersonId AND B.Location = 'London'

    SELECT A.FirstName, A.LastName, GetLocation.Location
    FROM TableA
    LEFT OUTER JOIN (SELECT PersonId, Location FROM TableB WHERE Location = 'London')GetLocation
    ON A.PersonId = GetLocation.PersonId


    I'd expect them to be identical.

    That is, the first syntax is cleaner, so use it, since the second is not needed to help the optimizer create the best plan.  In these cases, the optimizer will figure all that out on its own.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply