
In my previous blog post, I discussed the SQL Server wait stats. One of possible wait stats is of the type ‘CX_PACKET’. This particular wait type can lead to a bit of confusion.
From the whitepaper “SQL Server 2005 Waits and Queues”: “Occurs when trying to synchronize the query processor exchange iterator. Consider lowering the degree of parallelism if contention on this wait type becomes a problem.”
If one would read no further, he would probably lower the degree of parallelism, as I have seen advised on plenty of blogs. But if we continue reading:
“Parallel process waits can sometimes occur when data is skewed. In such cases, one parallel thread may process a larger number of rows while another may process a smaller number of rows and so on.
Basically, CXPACKET means that a query has been split up in sub-tasks, each executed by a different thread, where one or more threads are ready and waiting for the remaining thread(s) to finish. During the time the sub-tasks were executing, multiple cpu’s were occupied, which could lead to a reduction of throughput of other queries. But I believe the CXPacket wait time itself does not consume any resources.
If there aren’t that many other queries to process, CXPacket waits should therefore not be a problem. In a data warehouse environment, CXPACKET waits are in fact expected for multiple proc environments.
Ordinarily a well tuned OLTP application would not use parallelism unless an index is missing, there is an incomplete WHERE clause, or the query is not a true OLTP transaction. So before you start lowering the degree of parallelism, look for the type of transaction that is causing it, and in case of an OLTP transaction, see whether you can optimize the query so it does not exceed the cost of parallelism threshold. In case of a OLAP transaction, see if you can further divide the longest running sub-task in even more sub tasks, for example by partitioning your tables, so the amount of data to be processed for each subtask is less skewed. See Partitioned Tables, Parallelism & Performance considerations for a good blog post about this.
Now, so far for the theory, as there have been multiple issues with parallelism in the past.
For example, in SQL2000, parallel queries could cause an undetected deadlock with itself and run forever.
Another example in SQL 2005: FIX: You may receive an access violation error message when you run a parallel execution plan in SQL Server 2005.
Maybe the most convincing post that I’ve seen that parallelism is not always functioning as it should be are the posts from Jimmay May: Case Study: Part 2: CXPACKET Wait Stats & ‘max degree of parallelism’ Option: Suppressing Query Parallelism Eliminated CXPACKET Waits & Liberated 30% of CPU.
So while the about theory still holds, it might indeed be worth doing a “quick” check whether you are experiencing a parallelism ‘bug’ by altering the MAXDOP setting. But I would still try to look at the individual queries for the final solution.
If you are planning to alter the MAXDOP, it’s important to know that in SQL 2008, there is an issue with the MAXDOP setting: FIX: The MAXDOP option for a running query or the max degree of parallelism option for the sp_configure stored procedure does not work in SQL Server 2008.
What also important to know, is that MAXDOP is always specified per operator in the execution plan and not per execution plan, as explained in WHY DO I SEE MORE THREADS PER SPID IN SYSPROCESSES THAN MAXDOP?
Recommendations from the whitepaper:
Check for parallelism: sp_Configure “max degree of parallelism”.
If max degree of parallelism = 0, you might want to use one of the following options:
1. turn off parallelism completely for OLTP workloads: set max degree of parallelism to 1
1. limit parallelism by setting max degree of parallelism to some number less than the total number of CPUs. For example if you have 8 processors, set max degree of parallelism to <=4.”
My recommendation: MAXDOP can be set at two levels; system-wide or per query. If you set the system-wide setting to a low number, you might want to specify a higher maxdop setting in your maintenance scripts (such as index rebuild’s). Otherwise they might run a lot longer than before.