Complex joins

  • honza.mf (12/6/2011)


    Probably I will use right joins and nested joins only in case of self-defense. I prefer the readability and I think from left to right.

    Absolutely true. This question was intended to illustrate how a real-life problem sometimes forces you to choose the lesser of two evils. Both nested joins and right outer joins reduce readability and ease of understanding of a query - so pick your poison. I have been in this position a few times, and whatever I chose, I always felt it was not the optimal choice.

    BTW, now that I no longer have to support SQL Server 2000, I can use a better solution using a Common Table Expression:

    WITH CustomersWithRecentOrders AS

    (SELECT c.Name, o.OrderDate, c.SalesRep

    FROM dbo.Customers AS c

    INNER JOIN dbo.Orders AS o

    ON o.CustomerID = c.CustomerID

    AND o.OrderDate > DATEADD(month, -4, CURRENT_TIMESTAMP))

    SELECT e.Name AS SalesRep, co.Name AS Customer, co.OrderDate

    FROM dbo.Employees AS e

    LEFT JOIN CustomersWithRecentOrders AS co

    ON co.SalesRep = e.EmployeeID

    WHERE e.Position = 'SalesRep';

    Note that this query does NOT produce the exact same execution plan as the original, but it DOES produce the exact same results. And of all the options, I think this one is the most readable.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • SQL Kiwi (12/6/2011)


    Hi Hugo,

    Writing a question to demonstrate the equivalence of SQL expressions is very brave, so well done for that. Small point: in the explanation you say, "the only way to rewrite the query is to reverse the table order", but this code expresses the same SQL semantic I think:

    Would this not work as well?

    ;WITH cte_Subquery AS

    (SELECT c.SalesRep, c.Name, o.OrderDate

    FROM dbo.Customers AS c

    INNER JOIN dbo.Orders AS o

    ON o.CustomerID = c.CustomerID

    AND o.OrderDate > DATEADD(month, -4, CURRENT_TIMESTAMP))

    SELECT e.Name AS SalesRep, s.Name AS Customer, s.OrderDate

    FROM dbo.Employees AS e

    LEFT JOIN cte_Subquery AS s

    ON e.EmployeeID = s.SalesRep

    WHERE e.Position = 'SalesRep';

    I always go with CTEs in this situation when possible, but I would love to know if I'm off the mark on this one.

    EDIT: Of course I posted this 45 seconds after Hugo beat me to it. At least I know I was thinking along the right lines.

    ron

    -----
    a haiku...

    NULL is not zero
    NULL is not an empty string
    NULL is the unknown

  • Good hard question, made me think. I missed it but I did learn something from the excellent explanation. Thanks for submitting.

    http://brittcluff.blogspot.com/

  • ronmoses (12/6/2011)


    Would this not work as well?

    Yup! (As you already found out after posting).

    The only thing I don't like about your query is the semicolon before WITH. Semicolons belong at the end of each statement, not at the start.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • Hugo Kornelis (12/6/2011)


    BTW, now that I no longer have to support SQL Server 2000, I can use a better solution using a Common Table Expression:

    Even in SQL Server 2000, you can write the equivalent subquery/derived table as expressed by the CTE:

    SELECT

    SalesRep = e.Name,

    Customer = co.Name,

    co.OrderDate

    FROM dbo.Employees AS e

    LEFT JOIN

    (

    SELECT

    c.Name,

    o.OrderDate,

    c.SalesRep

    FROM dbo.Customers AS c

    JOIN dbo.Orders AS o ON

    o.CustomerID = c.CustomerID

    AND o.OrderDate > DATEADD(MONTH, -4, CURRENT_TIMESTAMP)

    ) AS co ON

    co.SalesRep = e.EmployeeID

    WHERE

    e.Position = 'SalesRep';

  • Cadavre (12/6/2011)


    I assumed that understanding the logic of join operators is integral to the majority of the visitors to this site.

    Not, I believe, a safe assumption. In my daily routine as an "Accidental DBA", I write relatively few queries, and very, very few complex queries with joins. I visit the site to learn, so that I have a chance of writing the queries quickly and efficiently when needs arise. (Of course, something is apparently rubbing off, 'cause I apparently got today's question right AND learned something!)

  • michael.kaufmann (12/6/2011)


    One minor flaw I'd attribute to speedily writing this query--the above one will only yield the same result as the original one if only employees are the customers and the IDs as customer and employee are equal.

    Thanks Michael - I did indeed write the syntax quickly, just from the execution plan, before Hugo so helpfully posted table definitions and sample data. That'll teach me, I guess.

  • Thanks Hugo,

    Great question, and between your form and Paul's I now have 2 more cases that my T-SQL formatter doesn't handle correctly; at least it's not mangling the SQL, but the resulting indentation makes little sense (and the parse tree is just wrong).

    Any objection to my adding it to my test suite (which is also the demo list on poorsql.com) once I've fixed it?

    http://poorsql.com for T-SQL formatting: free as in speech, free as in beer, free to run in SSMS or on your version control server - free however you want it.

  • Tao Klerks (12/6/2011)


    Any objection to my adding it to my test suite (which is also the demo list on poorsql.com) once I've fixed it?

    No problem for me!

    Sorry for breaking your formatter - and good luck getting it fixed. 😉


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • Hugo Kornelis (12/6/2011)


    The few timies where I did choose to use nested joins in actual code, I have always used parentheses and indentation for easier reading. In fact, my first draft of this question did use exactly that. I just took it out before submitting the question, because I am mean.:Whistling:

    Yes you are mean! :laugh:

    Actually, I think I rather misunderstood the point of the question. I thought you were intending to illustrate SQL/relational equivalences or something like that, but I now see you were aiming at the 'nested join' idea. I'll have to think about that a bit more to see if I understand the point completely.

    Hugo Kornelis (12/6/2011)


    I stiill stand by what I intended to write - that the only way to rewrite the query to a form that doesn't use nested joins involves reordering the table order and changing the left outer join to the far less cocmmon and harder to grasp right outer join form.

    Hmm - but doesn't the CTE/derived table example break that assertion? Or are they still 'nested joins' somehow. I think that's the point I'm struggling with - what precisely 'nested joins' means.

  • SQL Kiwi (12/6/2011)


    Hmm - but doesn't the CTE/derived table example break that assertion?

    It does. I only realised the CTE possibility later, after I already posted that earlier post.

    (For some reason I can't really explain, I am not very fond of the derived table version - I've used derived tables when needed, but always found the syntax to be lacking readability).


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • Hugo Kornelis (12/6/2011)


    (For some reason I can't really explain, I am not very fond of the derived table version - I've used derived tables when needed, but always found the syntax to be lacking readability).

    I'm somewhere in the middle on this one. I'd like to get you and Adam Machanic together in a room on this point though - he's a fierce advocate of the benefits of derived tables over CTEs. Something about the ability to select blocks of code to run easily I think is one main argument in favour.

  • This is one of the best QOTDs I have ever seen here. I had to really think this through to get it right. I ALMOST just checked #1 out of first impulse but figured it couldn't be that easy, especially once I saw the RIGHT JOIN answers. Wondering what would cause a DBA to make it that tough, I read the first query very carefully and realized that that the inner join would happen first and wouldn't happen the same on the query #1. So I ruled that out and #2 was easy to rule out. #3 screwed up the WHERE clause, so it had to be #4. A lot of work for first thing in the morning, but a good set of work as in my experience developers use nested joins not intentionally but by accident, as they are not up to speed on ANSI joins and are used to doing joins in the WHERE clause. And when they move to ANSI they sometimes do things like this. So very practical for real life troubleshooting.

  • Hugo Kornelis (12/6/2011)


    honza.mf (12/6/2011)


    Probably I will use right joins and nested joins only in case of self-defense. I prefer the readability and I think from left to right.

    Absolutely true. This question was intended to illustrate how a real-life problem sometimes forces you to choose the lesser of two evils. Both nested joins and right outer joins reduce readability and ease of understanding of a query - so pick your poison. I have been in this position a few times, and whatever I chose, I always felt it was not the optimal choice.

    Yes, if I need to choose a poison, I use some antidote. Here it can be anything that makes query more readable: right join with conditions near the table, parentheses around the nested join, CTE, or subquery. Some are better for hand made queries, some for generated code, it depends.



    See, understand, learn, try, use efficient
    © Dr.Plch

  • Fascinating question Hugo, thank you.

    I had to re-read 'Formatting Your FROM Clause' in the first volume of 'MVP Deep Dives' to answer it. Knowing where to find the answers is half of the battle.

Viewing 15 posts - 16 through 30 (of 43 total)

You must be logged in to reply to this topic. Login to reply