RE: Minus in SQLServer – SQLServerCentral

One Orange Chip

Points: 25041

August 19, 2005 at 6:18 am

Ok, I have now checked what it says in Inside SQL Server 2000, and I do not make the same interpretation as you do. I cannot find anywhere where she says that "Usually a join is more efficient than the alternative". The exact words are these:

Depending on your data and indexes, the outer-join formulation might be faster or slower than a correlated subquery. But before deciding to write your query one way or the other, you might want to come up with a couple of alternative formulations and then choose the one that's fastest in your situation.

As I said in my previous post it is unfortunate that different queries can produce the same results but not perform equally well, but since that is the case we should always, like I (and Kalen) said, test our queries and use the best one. In the case presented in this thread I constantly get better results using a correlated subquery than when using a left join filtering for null, even when changing the amount of data and/or indexes. Both queries use very similar execution plans, with exactly the same amount of I/Os, but the left join uses a filter condition that adds a little cpu activity. Hardly noticeable, but still clearly readable in the execution plan and/or statistics time output.

What Kalen say regarding "fewer than half the number of logical I/Os" is that in the example that she provides in the book, that is the result. However, what is interesting is that a) her example is very similar to the one here and b) when I try her exact example I do not at all get "fewer than half the number of logical I/Os". In fact I get the exact same result as I do with the example in this thread, that is the exact same number of logical I/Os. I can not be really sure what she used for testing though, but it looks like the pubs database, so I can not see where she got those numbers from. Perhaps it was a flaw in the RTM engine that has later been fixed in a service pack.

Finally, another interesting thing to try is to use the EXCEPT keyword in SQL Server 2005. Like I said above, EXCEPT performs a relational minus. Now, the product is still in beta, but when I execute the two statements above as well as one using EXCEPT two of the three execution plans are the same. Guess which one differs, and is also the slower one? As expected EXCEPT uses the fastest execution plan to perform a relational minus on the two sets, and that execution plan happens to be the same one as the one used by the query with the correlated subquery.

--
Chris Hedgate http://www.hedgate.net/
Contributor to the Best of SQL Server Central volumes
Articles: http://www.sqlservercentral.com/columnists/chedgate/