Improving Performance of Cross-Partition Queries

  • Solomon Rutzky (5/5/2010)


    Hey Paul. Actually, I had not had time to test the CROSS APPLY approach and was answering in terms of my solution against the simple WHERE clause solutions. I did just test your example and the result was rather interesting in that it depends on which metric you go by.

    Yes. As I said in my reply to TheSQLGuru, the difference will be probably impossible to show with only ten iterations - I was responding to your point about the WHILE loop being the 'best performing solution'. It's not a huge deal at all - but I'm going to go on about it a bit anyway, on a point of principle 🙂

    If you go by query cost (from the Actual Execution Plan), then the CROSS APPLY + $PARTITION approach comes out slightly cheaper at 0.0387787, as opposed to my solution's total cost of 0.1361422 (which is not a huge difference, but one definitely costs less).

    Be careful with this. The query cost is an internal metric used by the optimiser simply as a way of comparing plan alternatives - you only ever get the optimiser's cost estimate - regardless of whether you view an 'estimated' plan or an 'actual' plan. This surprises many people. SQL Server never shows an 'actual' cost since there is no meaningful way to compute it.

    If you go by STATISTICS IO and STATISTICS TIME, then my solution is slightly faster.

    I would expect the logical reads to be exactly the same - as you later show - because essentially the same operation is taking place. My issue with the WHILE loop is that it executes once per partition, whereas the set-based code does it all in one step. As I said, the difference is likely immeasurable (at least repeatably) since the number of iterations is so small, but even so...
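    For readers following along, the set-based CROSS APPLY + $PARTITION pattern being discussed can be sketched roughly as follows. This is an illustrative reconstruction, not the exact code from the thread: the table `dbo.BigTable`, column `IntDate`, and partition function `pfDates` are assumed names.

    ```sql
    -- One MIN per partition, computed in a single set-based statement:
    -- enumerate the partition numbers from sys.partitions, then CROSS APPLY
    -- a per-partition MIN restricted via $PARTITION, and take the MIN of those.
    SELECT MIN(CA.MinDate) AS MinDate
    FROM (
        SELECT DISTINCT partition_number
        FROM sys.partitions
        WHERE [object_id] = OBJECT_ID(N'dbo.BigTable')
          AND index_id <= 1        -- heap (0) or clustered index (1)
    ) AS P
    CROSS APPLY (
        SELECT MIN(T.IntDate) AS MinDate
        FROM dbo.BigTable AS T
        WHERE $PARTITION.pfDates(T.IntDate) = P.partition_number
    ) AS CA;
    ```

    Because the `$PARTITION` predicate pins each inner aggregate to a single partition, the optimiser can satisfy each one with a seek rather than scanning every partition.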

    Now, as far as being dynamic goes, the solution I presented here was simplified for the example. In the systems where I work we take a dynamic approach to get the highest PartitionID, since we store those along with the CustomerIDs. However, I do appreciate that your approach (and maybe also Ben-Gan's) is dynamic in more situations than mine, since my approach assumes a non-sparse list of numbers (or something sequential). For my purposes that is all it will ever be, but it is good for people to have another approach if their data is not as predictable (or, in my situation, if we skipped PartitionID 11 and went to 12--which we won't do, but that would expose the issue).

    Understood. Thanks for clarifying and finding a way to make the tests compare apples with apples.

    Paul

  • TheSQLGuru (5/5/2010)


    Well, the BEST answer is for MS to fix the damn optimizer to work properly with partitions in ALL ASPECTS!! 🙂

    Oh I agree - but to be fair to the optimiser, the main problem in 2005 was the design choice made to implement the partition selection using - yes - a correlated loop join. The optimiser has no choice, sadly 🙁

    SQL Server 2008 eliminates the problem using a far better design (partition as hidden leading column of every index), and as far as I know, most people* are happy with the very performant partitioning in 2008 and later.

    One of my favourite things about the 2008 implementation is that it is now possible to perform an 'index seek' on a heap!

    But they are too busy building stuff that no one (or maybe 0.03% of installed base) will EVER use . . . sigh . . . :angry:

    Tell me about it. I haven't even touched every new 'feature' of 2008 yet and R2 is out!

    edit: *except Tim 😛

  • Hi Paul,

    To be honest, the performance aspect of 2008 isn't much (if any) better than 2005.

  • chapman.tim (5/5/2010)


    Hi Paul, To be honest, the performance aspect of 2008 isn't much (if any) better than 2005.

    Edited my post to reflect your opinion 🙂

    So, I was thinking of the thread-per-partition (selected partitions > 1) parallelism issue as discussed here:

    http://technet.microsoft.com/en-us/magazine/2008.04.dwperformance.aspx

    Genuinely interested to hear more about your experiences.

    Paul

    I wrote an article for SQL Mag not too long ago (sadly, it hasn't been published yet) about the differences between the two versions. I have also had several talks w/ the PM for partitioning @ MS about the issues I've seen (the issues Solomon mentioned today about having to do the partition lookup). For the scenarios I looked at, they're exactly the same: the constant scan was removed from the 2008 version of the query plan, but the effect is still the same.

    One thing I didn't look into was parallelism. The scenario I was interested in was an OLTP environment, so parallel queries are something I was trying to avoid where I could. 2008 does allow additional threads to satisfy queries across partitions, but it comes at a cost. The idea could be (and likely is) great for an OLAP environment, where you have heavy queries that are going to eat up resources anyway; throwing all of your available schedulers at the query works great there. But it could come back to bite you in an OLTP system, because the available threads may all go off to satisfy that one query while everything else waits.
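    A common general mitigation for exactly this OLTP concern (a standard technique, not something prescribed in this thread; table and column names are illustrative) is to cap the degree of parallelism on the offending query rather than server-wide:

    ```sql
    -- Restrict this one query to a single scheduler so a cross-partition
    -- aggregate cannot monopolise the server's worker threads on an OLTP box.
    SELECT MIN(IntDate)
    FROM dbo.BigTable
    OPTION (MAXDOP 1);
    ```

    A server-wide cap via `sp_configure 'max degree of parallelism'` is the blunter alternative, but the per-query hint keeps the OLAP-friendly default available for workloads that benefit from it.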

    Hell, I just installed my first copy of SQL 2008 (NOT R2) on my laptop just two months ago!! I have never had a version of 2008 R2 on anything. Maybe 2% of my clients' servers are 2008 (NOT R2) boxes, with probably 20% still on SQL 2000. Anyway, enough of the partitioning-sucks bandwagon for me. 🙂

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Tim,

    Yes, the MIN/MAX rewrite rule is still broken in 2008, that is true - I don't know anyone with R2 to test it there, though.

    The parallelism thing really made partitioning just about useless on 2005 for non-OLTP stuff. All very sad.

    Never yet encountered partitioning on a super-high-volume OLTP system - the ones I've encountered used local partitioned views instead, for a variety of reasons.
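    For anyone unfamiliar with the local partitioned views Paul mentions, the general shape is a set of member tables with CHECK constraints, unioned by a view; the constraints let the optimiser eliminate members at compile time. A minimal sketch, with all object names and ranges invented for illustration:

    ```sql
    -- Member tables: the partitioning column carries a CHECK constraint
    -- and must be part of the primary key for a valid partitioned view.
    CREATE TABLE dbo.Orders2009
    (
        OrderID   int NOT NULL,
        OrderDate int NOT NULL CHECK (OrderDate BETWEEN 20090101 AND 20091231),
        CONSTRAINT PK_Orders2009 PRIMARY KEY (OrderID, OrderDate)
    );

    CREATE TABLE dbo.Orders2010
    (
        OrderID   int NOT NULL,
        OrderDate int NOT NULL CHECK (OrderDate BETWEEN 20100101 AND 20101231),
        CONSTRAINT PK_Orders2010 PRIMARY KEY (OrderID, OrderDate)
    );
    GO
    -- The view unions the members; a query with a predicate on OrderDate
    -- touches only the members whose CHECK ranges can qualify.
    CREATE VIEW dbo.OrdersAll
    AS
    SELECT OrderID, OrderDate FROM dbo.Orders2009
    UNION ALL
    SELECT OrderID, OrderDate FROM dbo.Orders2010;
    ```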

    Paul

  • Paul White NZ (5/5/2010)


    My issue with the WHILE loop is that it executes once per partition, whereas the set-based code does it all in one step. As I said, the difference is likely immeasurable (at least repeatably) since the number of iterations is so small, but even so...

    Paul

    Hey Paul. Generally speaking, I do try to look for set-based approaches first, and I did try a few, but I did not know about CROSS APPLY and $PARTITION, so I ended up with this - which really isn't so bad, given the immeasurable difference.
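    For reference, a per-partition WHILE loop of the general shape Solomon describes might look like the following. This is a hedged sketch, not his actual code: the table `dbo.BigTable`, column `IntDate`, and partition function `pfDates` are assumed names, and the syntax is kept 2005-compatible.

    ```sql
    -- Loop over partition numbers, seeking the MIN within each partition
    -- and keeping the running minimum in a variable.
    DECLARE @p int, @maxp int, @mn int, @best int;

    SET @p = 1;
    SELECT @maxp = MAX(partition_number)
    FROM sys.partitions
    WHERE [object_id] = OBJECT_ID(N'dbo.BigTable')
      AND index_id <= 1;           -- heap (0) or clustered index (1)

    WHILE @p <= @maxp
    BEGIN
        SELECT @mn = MIN(IntDate)
        FROM dbo.BigTable
        WHERE $PARTITION.pfDates(IntDate) = @p;

        IF @mn IS NOT NULL AND (@best IS NULL OR @mn < @best)
            SET @best = @mn;

        SET @p = @p + 1;           -- @p += 1 is 2008+ syntax
    END;

    SELECT @best AS MinIntDate;
    ```

    Logically this does the same per-partition work as the set-based CROSS APPLY version, just one partition per iteration instead of all at once.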

    However, I did find this to be interesting. Someone in an earlier post asked if I had tried putting PartitionID as the left-most field in the Clustered Index. I tried that and found no difference in the result of a plain MIN(IntDate) over the table. Since I still had that table lying around, and thought the newly built indexes might show a slight difference, I tried both approaches against it. My approach used slightly fewer reads than it did against the table I had been testing against. But then I tried the CROSS APPLY + $PARTITION approach, and it took 1 minute and 30 seconds. Checking the stats, it caused 1,262,939 logical reads and took 52,328 ms CPU time and 89,744 ms elapsed time. I would have thought that the performance would have been, at the very least, equal to the WHILE loop.

    Take care,

    Solomon...

    SQL#: https://SQLsharp.com/ (SQLCLR library of over 340 functions and procedures)
    Sql Quantum Lift: https://SqlQuantumLift.com/ (company)
    Sql Quantum Leap: https://SqlQuantumLeap.com/ (blog)
    Info sites: Collations     •     Module Signing     •     SQLCLR

  • Solomon Rutzky (5/6/2010)


    But then I tried the CROSS APPLY + $PARTITION approach and it took 1 minute and 30 seconds. Checking the stats it caused 1,262,939 logical reads and took 52,328 ms CPU Time and 89,744 ms elapsed time. I would have thought that the performance would have been, at the very least, equal to the WHILE loop.

    Yes it should be. Can't see your data or what you did, so hard to advise 😉

    This author has no idea about execution plans. He just checked STATISTICS IO and did not analyse the execution plan.

    Select min(field1) from partitionedtable1

    The above statement on a partitioned table has no conditions and will, 100% of the time, cause a clustered index scan of all partitions. That is why it is slow. What you need to do is add a condition to the query and also add an appropriate index on field1 (field1 can be in an index with multiple columns or included columns; the only thing you need to ensure is that field1 is the first field in the index key list). The condition can be something like field1 > -999999999 (a value you know is less than all field1 values).

    In other words:

    select min(field1)
    from partitionedtable1
    where field1 > -9999999999

    Now, if you check the execution plan, it should have become either an index seek, or a clustered index seek (if there is no appropriate index on field1), on each partition. It should be much faster than before, although it might still be slower than against a non-partitioned table. However, when you write your query, if you can help the optimizer determine at compile time which partitions will be used, you can achieve better query performance than with non-partitioned tables.
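    The "appropriate index" being suggested might be sketched as follows, reusing the poster's illustrative names (`partitionedtable1`, `field1`) and assuming field1 is a bigint so the -9999999999 floor fits:

    ```sql
    -- With field1 as the leading key, the range predicate below can drive
    -- a seek at the start of each partition's index instead of a full scan.
    CREATE NONCLUSTERED INDEX IX_partitionedtable1_field1
        ON dbo.partitionedtable1 (field1);

    -- The rewritten aggregate, with the "floor" predicate the poster suggests:
    SELECT MIN(field1)
    FROM dbo.partitionedtable1
    WHERE field1 > -9999999999;
    ```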

  • fat wallet (6/29/2010)


    This author has no idea about execution plans. He just checked STATISTICS IO and did not analyse the execution plan.

    Select min(field1) from partitionedtable1

    The above statement on a partitioned table has no conditions and will, 100% of the time, cause a clustered index scan of all partitions. That is why it is slow. What you need to do is add a condition to the query and also add an appropriate index on field1

    ...

    Hello. Thanks for commenting, but I did check the execution plan (as noted in the article by the cost estimates). And the suggestion to add an index was given by Paul in Comment #2. However, in the context of my particular situation (and, I assume, for others as well), the cost of adding an index (in terms of space used and degradation of DML operations on several 90+ million row tables) is not worth it compared to adding this (or the CROSS APPLY) logic to the three procs that make use of the MIN() / MAX() functions.

