RE: Improving Performance of Cross-Partition Queries

SSC Guru

Points: 150467

May 5, 2010 at 1:21 pm

Solomon Rutzky (5/5/2010)
Hey Paul. Actually, I had not had time to test the CROSS APPLY approach and was answering in terms of my solution against the simple WHERE clause solutions. I did just test your example and the result was rather interesting in that it depends on which metric you go by.

Yes. As I said in my reply to TheSQLGuru, the difference will be probably impossible to show with only ten iterations - I was responding to your point about the WHILE loop being the 'best performing solution'. It's not a huge deal at all - but I'm going to go on about it a bit anyway, on a point of principle 🙂

If you go by query cost (based on Actual Execution Plan), then the CROSS APPLY + $PARTITION is slightly faster coming in at 0.0387787 as opposed to my solution which has a total cost of 0.1361422 (which is not a huge difference but still one definitely costs less).

Be careful with this. The query cost is an internal metric used by the optimiser simply as a way of comparing plan alternatives - you only ever get the optimiser's cost estimate - regardless of whether you view an 'estimated' plan or an 'actual' plan. This surprises many people. SQL Server never shows an 'actual' cost since there is no meaningful way to compute it.

If you go by STATISTICS IO and STATISTICS TIME, then my solution is slightly faster.

I would expect the logical reads to be exactly the same - as you later show - because essentially the same operation is taking place. My issue with the WHILE loop is that it executes once per partition, whereas the set-based code does it all in one step. As I said, the difference is likely immeasurable (at least repeatably) since the number of iterations is so small, but even so...

Now, as far as dynamic goes, the solution I presented here was simplified for the example. In our systems where I work we have a dynamic approach to get the highest PartitionID since we store those along with the CustomerIDs. However, I do appreciate that your approach (and maybe also that of Ben-Gan) is dynamic in more situations than mine since my approach assumes a non-sparse list of numbers (or something sequential). For my purposes that is all it will ever be but it is good for people to have another approach if their data is not as predictable (or in my situation if we skipped PartitionID 11 and went to 12--which we won't do but that would expose the issue).

Understood. Thanks for clarifying and finding a way to make the tests compare apples with apples.

Paul

Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi