Max Degree of Parallelism (maxdop)

  • Our shop is mostly into warehousing; we receive large datasets (upwards of 1TB in 1 or 2 tables) every 2 months, load this in SQL, transform the data, and score a variety of models. Models are mostly a large number of CASE-statements that produce values that are added/subtracted. In short I/O is our bottleneck, not CPU-power. Our main server (SQL2005) is a 16CPU machine with plenty of RAM, 2 physical drives for TEMPDB, separate drives fro data and logs/indexes. The machine runs pretty well.

    The other day, I got to play with the maxdop function to reduce the number of processors that each SQL-statement can use. I was astonished by my findings and they appear to be somewhat counter intuitive. For example transform or scoring queries that run for >24hours under 16CPUs can run in a matter of hours when Maxdop is set to 4 or even 2.

    Apparently when SQL2005 is left to its own devices it does not operate at its peak. what is behind that? Any explanation or pointers towards explanations would be greatly appreciated.

  • Sometimes the cost of running queries over many processors is more expensive because SQL has to split the load across the processors and then gather the steams back up to present the data. My guess is that if you are not using the maxdop hint you will see a lot of CXPACKET waits which is a wait type due to parallelism. This would indicate that the finished threads are waiting on others to finish.

  • don't assume just because you see a cxpacket you have a problem - it's the wait time which is important.

    Without wishing to generalise too much - usually poorly formed or poorly optimised sql will take a parallel plan in an attempt to perform better, sadly it rarely does, typically table scans are good candiates for parallel ops, as you want the whole table it sort of makes sense, I usually find poor parallel performance shows a lack of good covering/secondary indexes.

    You can set the number of processors/cores per query - a good starting point is usually 50% of cores and then adjust accordingly. I've found in tests a 600% performance difference going from 1 to 16 cores and 100%+ gain going from 8 to 16 - so be careful in cutting back.

    Make sure if you have intel chips that you've disabled Hyperthreading - or at least test with it on and off - as HT can do some bad things.

    [font="Comic Sans MS"]The GrumpyOldDBA[/font]
    www.grumpyolddba.co.uk
    http://sqlblogcasts.com/blogs/grumpyolddba/

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply