Often, when DBAs are looking at total wait statistics on the servers they manage, CXPacket is at or near the top of the list. If this is the case for you as well, chances are your server isn’t on fire, but there could be room for improvement.
What Does It Mean?
Books Online defines the CXPacket wait type as: “occurs with parallel query plans when trying to synchronize the query processor exchange iterator. If waiting is excessive and cannot be reduced by tuning the query (such as as adding indexes), consider adjusting the cost threshold for parallelism or lowering the degree of parallelism”.
The first part “occurs with parallel query plans when trying to synchronize the query processor exchange iterator”, is a fancy way of saying that sometimes queries need to use more than one thread to get their work done. When a query goes from serial to parallel, you’ll probably see some CXPacket waits.
The second part of the definition is where things get interesting. Microsoft is suggesting that first to look at tuning the query before making instance-wide configuration changes. While I agree with this if you’re only looking at trying to fix a single query, most of the time when you’re trying to bring down your CXPacket wait times, you’re trying to achieve this at an instance level. That’s where two instances configurations, ‘cost threshold for parallelism’ and ‘max degree of parallelism’ come in.
Cost Threshold for Parallelism
For each query ran, SQL Server determines an associated cost. What this cost maps to in terms of measurable units is a bit of a mystery. Officially, it’s the “measure of seconds required to run the serial plan based on a specific hardware configuration”. Who knows what this configuration is, but unless you were able to reproduce the environment identically in terms of network, servers and storage – it doesn’t really matter anyway. The important thing to remember is at some point, SQL Server determines a tipping point for when a query goes parallel or stays serial. This tipping point is determined by the ‘cost threshold for parallelism’ setting. By default, this is set to 5. Unless all of the activity against your SQL Server are low millisecond queries queries that have the optimal indexes available to them (I can hear you laughing now), 5 is probably too low. So where to start? Brent Ozar’s group recommends starting with 50 and adjusting as appropriate.
Max Degree of Parallelism (MAXDOP)
Your MAXDOP setting determines the maximum number of processors available for query execution. The default value is 0 – which means SQL Server will automatically determine the maximum number of processors available, up to 64. While having all processors available for query execution may sound like a good thing on the surface, at a certain point the overhead in scheduling and keeping track of all of the worker threads can adversely affect the run time of your query. So where is the sweet spot? Microsoft has a KB article detailing these recommendations, but in short, they recommend for servers with 8 or more proocessors, set MAXDOP to 8. For servers with less than 8, the recommendation is to stick with the default.
An important thing to remember regarding MAXDOP is that although it determines the number of processors available, this is not the same as the number of worker threads. Often, there is confusion after setting MAXDOP, and then looking at sys.sysprocesses and seeing more than 8 threads pounding away at your query. The number of processors available are per operator. So, assuming a MAXDOP of 8, each join, sort, scan, seek, etc can have up to 8 worker threads, resulting in many more threads than the MAXDOP setting for the entire query. A detailed explanation of this scenario is can be found here.