SubQuery

  • Hugo Kornelis (1/11/2013)


    I actually have a suggestion on Connect to add this feature - feel free to add your vote if you agree. Link: https://connect.microsoft.com/SQLServer/feedback/details/299231/add-support-for-ansi-standard-row-value-constructors

    Done.

    Obviously not high priority as they said they were considering it over 5 years ago.

    Someone said "Having used these in ORACLE over EIGHT YEARS AGO it's stunning that this has been on the very distant back-burner for OVER FIVE YEARS."

    I used these in Oracle over TWENTY years ago!

  • Thanks Hugo for the, as usual, enlightening post. When I read your posts, I am reminded of a quote from an MIT mathematics professor turned singing comedian (only in America?) who said:

    "It's a sobering thought to realize that, when Mozart was my age, he'd already been dead for three years. Kind of makes you feel like you haven't done anything."

    [font="Verdana"]Please don't go. The drones need you. They look up to you.[/font]
    Connect to me on LinkedIn

  • Thanks Hugo for a great post on the differences.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • Thanks for an easy one



    --Mark Tassin
    MCITP - SQL Server DBA
    Proud member of the Anti-RBAR alliance.
    For help with Performance click this link[/url]
    For tips on how to post your problems[/url]

  • Hugo Kornelis (1/11/2013)


    P.S. The article by Pinal Dave that is referenced only compares anti-semi-join with NOT IN, not with NOT EXISTS; they are not always equivalent (when NULL values can be involved). And he then bases his conclusions on the percentages in the execution plan output; these are estimated costs, not actual costs, and they can be very wrong - so this is one of the worst methods for determining if a query is faster or not.

    Gail Shaw on the other hand wrote several articles about the performance differences or lack thereof between IN/EXISTS/JOIN/NOT IN/NOT EXISTS and about beat the subject like a dead horse.

    She then later schooled me with said articles. 🙂

    Exists vs In is part of it, but there's several articles (which I presently can't find) that go into more depth than I normally would want... except reading Gail's stuff is generally enjoyable and you always learn something new.

    The only one I can find is Exists vs In, but she's done a bunch more.

    http://sqlinthewild.co.za/index.php/2009/08/17/exists-vs-in/



    --Mark Tassin
    MCITP - SQL Server DBA
    Proud member of the Anti-RBAR alliance.
    For help with Performance click this link[/url]
    For tips on how to post your problems[/url]

  • Wanted to briefly put in my 2 cents for EXCEPT. The tables don't have to be the same. The columns being EXCEPTed have to be compatible between the tables.

    EXCEPTs are very usefull animals. But as their primary benefit is to discover discrepancies between two tables [and then do something about them], I have found the MERGE statement to be even more useful. I've been surprised at how few database professionals actually understand this surprisingly powerful statement. I deal a lot with bi-temporal tables (What did I know and when did I know it). As such, I never delete a row from my tables. But I will update how long a fact was valid, and then Insert a new row with the new valid fact. The MERGE statement does a terrific job of sorting my rows into 3+ groups that are then dealt with easily and quickly.

    One of these days I hope to write an article about how I actually use the MERGE statement to deal with Bi-Temporal tables that are extremely useful for the decision support and the financial worlds (not to mention the Business Intelligence world.)

  • As one who avoids joins in most cases I answered with what I used and missed. Good question and thanks.

    Not all gray hairs are Dinosaurs!

  • Easy question for Friday. Thanks.

  • Nice..

    +1

  • Thanks for the question and thanks to Hugo for his elaborate explanation 🙂

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Thanks for the great question and all the excellent comments.

  • I think this is a question with a very wrong answer. Even wronger than Hugo suggests.

    Hugo Kornelis (1/11/2013)


    I don't really agree with the answer marked as correct.

    NOT IN can only be used in limited cases. The comparison condition must be single-column, and there must not be any null values in the comparison column in the subquery. Because of those limitations, I never use NOT IN for this kind of questions - I prefer to use the same method in all cases.

    I agree with that much - that's one of the errors.

    NOT EXISTS and the outer join with IS NULL test can both be used under all circumstances, but I far prefer the NOT EXISTS method, because it is far easier to write, and (even more important!) to understand when you have to revisit your code later. In most cases, the execution plan will be the same, so there is no performance benefit of one over the other (this was not always true in older versions of SQL Server, but that has long been changed).

    Unfortunately, that outer join with IS NULL test can be used everywhere NOT EXISTS will work depends on the existence of columns which are not nullable; that is of course guaranteed in the relational model, but not in SQL: a a table every one of whose columns is nullable is permissable. The IS NULL test can't distinguish the case where there is no corresponding row from the case where there is a row in whch every column is NULL. The NOT EXISTS method can make that distinction.

    So the technically correct answer to the question would be that only NOT EXISTS and outer join with IS NULL test (a so called anti-semi-join) can always be used. And I would add that I recommend sticking to the NOT EXISTS method for maintainability, unless you happen to run into a situation where the execution plans are not equal, and the plan with the anti-semi-join performs better - and even in that case, I would still prefer the NOT EXISTS version, unless the performance difference prevents me from achieving the required response times.

    No, the technically correct answer is that only the NOT EXISTS method is correct, unless you are lucky enough to be in an environment a littlemore like the relational model that SQL is.

    Tom

Viewing 12 posts - 16 through 26 (of 26 total)

You must be logged in to reply to this topic. Login to reply