Jeffrey Williams wrote:
Look at the values that come out of the hash match - in the first query the number of rows is lower than what you see from the second query.
The difference in the number of rows isn't going to cause much of an issue, because both queries end up reading the same pages. The estimated data size for the 124,000+ rows is 3410KB vs 2200KB for the second query, and since SQL Server reads by page, either one results in the same amount of work (1 scan, 681 logical reads, it appears).
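If you want to see those numbers yourself, turning on the IO statistics before running each version will print the scan count and logical reads per table in the Messages tab. The query here is only a placeholder - substitute whichever version you are measuring:

SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- placeholder: run the actual query being compared here;
-- the Messages tab will then show one line per table, e.g.
-- "Table 'P'. Scan count 1, logical reads 681, ..."
SELECT COUNT(*) FROM dbo.P;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;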
So - the hash match reduces the number of rows in the first plan to 625, which makes both the sort and the merge join less costly. Looking at the final SELECT, the memory grant for the first plan is 1712 vs 1776 for the second plan. Granted, the differences are minor, but the first plan appears to be a bit more efficient even though it appears to read more rows from the index (again - those extra rows come from pages that are being read anyway, so they will all be loaded into memory regardless).
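(Assuming those 1712/1776 figures come from the Memory Grant property on the root SELECT operator of the actual plan: if a version of the query ran long enough, you could also watch the grant while it executes with something along these lines - an untested sketch against the standard memory-grants DMV:

SELECT mg.session_id,
       mg.requested_memory_kb,
       mg.granted_memory_kb,
       mg.used_memory_kb,
       txt.text AS query_text
FROM sys.dm_exec_query_memory_grants AS mg
CROSS APPLY sys.dm_exec_sql_text(mg.sql_handle) AS txt;

It only lists queries that are executing at that moment, though, so for a query this quick the plan property is the easier place to read it.)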
Removing that filter should have resulted in the exact same plan... did that happen, or did you get something different from the other two plans?
I've been getting the same-looking plan all along - just different estimated and actual numbers of rows retrieved from P, and in the case of the first plan the numbers are much higher (roughly 50%) than the correct count.
And what comes out of the hash match is 255 rows in both cases - the final and correct row count for the entire query (which makes sense: the last join, to tL, is a left join, and there are no conditions on that table).
The query runs quickly in both versions, though, so it's not really a performance issue that I need to solve. But I could see this becoming a performance issue with similarly constructed queries that process much more data, where badly incorrect row estimates might result in a seriously sub-optimal query plan. So I'm trying to figure out whether there is something here that I can learn about performance tuning in general - for example, whether it's usually better to put inner joins first and left joins later, or something to that effect.
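For concreteness, the kind of syntax variation I'm asking about is sketched below. The table and column names are made up (my real query involves P and tL among other tables), and as far as I can tell the two forms return the same rows here, since nothing later in the query filters on the left-joined table - the question is whether the written order nudges the optimizer toward a different plan:

-- Variant 1: inner join written first, left join last
SELECT a.ID, p.Val, l.Note
FROM dbo.A AS a
INNER JOIN dbo.P AS p ON p.AID = a.ID
LEFT JOIN dbo.tL AS l ON l.AID = a.ID;

-- Variant 2: the left join written before the inner join
SELECT a.ID, p.Val, l.Note
FROM dbo.A AS a
LEFT JOIN dbo.tL AS l ON l.AID = a.ID
INNER JOIN dbo.P AS p ON p.AID = a.ID;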
Testing large numbers of syntax variations can quickly get to be a major pain in the fundament, even when such testing is restricted to queries having noticeable performance problems. I'd like to have a better handle on what is likely to be a good syntax (and WHY a particular syntax is likely to be good or bad), instead of having to constantly assemble exhaustive series of tests.