Dear David,
Thanks for detailed explanation – I use your article as reference for all these topics. Unfortunately unlike your conclusion for EXPECT case I found situation where NOT IN gives dramatically worst performance:
I try to compare local and remote table. For both cases EXCEPT and LEFT JOIN it gives roughly the same performance. See here:
[EXCEPT]
SELECT ac.Article_Code
FROM relAC ac
EXCEPT
SELECT Code
FROM [SRV-DCI-PRAHA].[DCI].[dciowner].Products
[font="Courier New"]
Local table: Scanning a clustered index, entirely or only a range.
Remote Table: Send a SQL query to another than the current SQL Server.
Remote Table: Compute new values from existing values in a row.
Both: Match rows from two suitably sorted input tables exploiting their sort order.[/font]
Cost = 48,5442
SELECT ac.Article_Code
FROM relAC AS ac
LEFT JOIN [SRV-DCI-PRAHA].[DCI].[dciowner].Products AS pr
ON ac.Article_Code = pr.Code
WHERE pr.Code IS NULL
[font="Courier New"]
Local table: Scanning a clustered index, entirely or only a range.
Remote Table: Send a SQL query to another than the current SQL Server.
Remote Table: Compute new values from existing values in a row.
Both: Match rows from two suitably sorted input tables exploiting their sort order
Both: Restricting the set of rows based on a predicate.[/font]
Cost = 48,6391
[NOT IN]
Unfortunately using of NOT IN generates local temporary table for better performance for rewinds and generate Nested Loops for each row of local table for comparison of each row in temporary table. These loops dramatically impair performance of actual query. See Execution Plan here:
SELECT ac.Article_Code
FROM relAC ac
WHERE ac.Article_Code NOT IN (
SELECT Code
FROM [SRV-DCI-PRAHA].[DCI].[dciowner].Products
[font="Courier New"]
Local table: Scanning a clustered index, entirely or only a range.
Remote Table: Send a SQL query to another than the current SQL Server.
Remote Table: Compute new values from existing values in a row.[/font]
[font="Courier New"]Remote Table: Stores the data from the input into a temporary table in order to optimize rewinds.
Both: For each row in the top (outer) input, scan the bottom (inner) input, and output matching rows.[/font]
Cost = 77826,3!!
Petr
__________________________________
Standing on the shoulders of giants ...