RE: xml in subselect and performance => alternatives?

GSquared

SSC Guru

Points: 260824

December 3, 2012 at 9:39 am

#1564954

With that same structure, try this:

SELECT *

FROM dbo.Parcel AS P

INNER JOIN dbo.ParcelDetail AS PD

ON P.ParcelId = PD.ParcelId

WHERE P.ParcelId BETWEEN 10 AND 20;

You'll still get a Nested Loops Join.

Has nothing to do with it being a sub-select. Has to do with data distributions.

The second table, in this sample, has between 2 and 7 rows per ParcelID value. The most efficient way to handle that, in this case, is a Nested Loops Join.

You might be able to force this into a Merge Join if the indexes are right and the data volume is high enough, but odds are SQL Server would pick a Merge Join if it would actually be more efficient.

Your test script also produces multiple identical ParcelID values in the Parcel table. Does that parallel your actual environment, or is ParcelID unique in the table that is meant to parallel? If it isn't unique, that will also complicate the join. When I deduped that table and set the clustered index to Unique, the execution plan remains the same, but the execution time on a 10k-row query went from 452 ms CPU time and 1.217 seconds total time, down to 200 ms CPU and 700 ms total. Almost twice as fast. Of course, it returns less rows (10k vs 18k) once the primary table is deduped, which accounts for some of the speed-up, but some of it is also the stats on the table are treated differently with a unique index than with a non-unique one.

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon