
Cluster Update KB2710487 CPU Increase

This is an interesting one.

Windows Server 2008 R2 SP1 EE
SQL Server 2008 R2 SP2 CU1 EE

We have installed Cluster Service Update KB 2710487, which is supposed to fix inexplicable Cluster Service crashes with error 1359.

"The cluster service encountered an unexpected problem and will be shut down. The error code was '1359'"

The problem is that the Update increases CPU consumption a great deal. We've identified some interesting behavior:
• Install the patch on one Node, reboot the Node. CPU is normal (0) until you move a cluster group to that node
• Each Cluster Group move adds a 1.25% increase in CPU to the Node, and it persists. Forever. Note that these are dedicated Nodes, and the SQL Groups are completely inactive (no users, no databases)
• This CPU increase persists even after all Cluster Groups are moved off the Node
• 3 processes clearly account for the total CPU increase. Comparing these processes to non-updated hosts, we found that the CPU usage of each has clearly increased, and each is in constant use. On non-patched hosts all 3 processes have little CPU usage and are in use infrequently. Here are some metrics:

Non-patched host:

Process            Avg CPU % Frequency
------------------ --------- ------------------------------
Clussvc            1         Frequent but not steady
Lsass              0         Infrequent
Wmiprvse           0         Infrequent
Total Avg CPU      1

Patched host:

Process            Avg CPU % Frequency
------------------ --------- ------------------------------
Clussvc            10        Constant
Lsass              2         Constant
Wmiprvse           1         Constant
Total Avg CPU      13

• When a workload is placed on the Cluster Node, CPU consumption of these 3 processes increases. The worst we're seeing is a 15% CPU increase on each node, where each Node is hosting 4 active SQL Groups. The inactive Node in this Cluster has a steady 5% CPU load. It is hosting 4 SQL Groups with activity.
• CPU usage patterns by these 3 processes on Updated Nodes are the same regardless of Node CPU and Memory resources
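
To put the numbers above in one place, here is a rough Python sketch of the arithmetic. It assumes the per-move increase is linear (1.25 points per Cluster Group move, as observed) and uses the per-process averages from the two tables; the function and variable names are made up for illustration only.

```python
# Assumption: every Cluster Group move adds a fixed 1.25 points of CPU,
# as observed on the patched Nodes. Linear accumulation is a guess.
CPU_PER_MOVE = 1.25

def projected_overhead(moves, per_move=CPU_PER_MOVE):
    """Projected persistent CPU overhead (in % points) after `moves` group moves."""
    return moves * per_move

# Per-process average CPU % taken from the tables in the post.
unpatched = {"clussvc": 1, "lsass": 0, "wmiprvse": 0}
patched = {"clussvc": 10, "lsass": 2, "wmiprvse": 1}

def total_cpu(processes):
    """Sum the average CPU % of the listed processes."""
    return sum(processes.values())

# Difference in total average CPU between patched and non-patched hosts.
delta = total_cpu(patched) - total_cpu(unpatched)

print("Overhead after 4 moves:", projected_overhead(4))
print("Patched vs non-patched delta:", delta)
```

Under that linear assumption, 4 moves works out to 5 points, which lines up with the steady 5% we see on the Node hosting 4 SQL Groups. That is only a consistency check, not proof of the mechanism.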

MS tried to argue that this was not a significant increase... riiiiiiiiight. So after days and days of data collection and arguing with support, they've finally agreed this is a bug, and it will be fixed.

Unfortunately this is as far as we've gotten. We have no root cause, or time frame for the fix. We badly need to fix this 1359 error, but we can't afford a 15% increase in CPU.

One thing we've found is that Clusters with Updated Nodes have overly verbose Heartbeat chatter:

Cluster Update Status       Packets/Sec
--------------------------- ----------------
0 of 5 nodes Updated        2572
1 of 2 nodes Updated        5795
2 of 2 nodes Updated        12008
4 of 4 nodes Updated        18865
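
Since these Clusters have different Node counts, the raw totals are hard to compare. A small Python sketch that normalizes the figures per Node is below. It assumes each Packets/Sec value is a cluster-wide total, which may not match how the counters were actually collected; the names are illustrative only.

```python
# (label, updated nodes, total nodes, packets/sec) from the table in the post.
# Assumption: packets/sec is a cluster-wide total, not a per-node reading.
observations = [
    ("0 of 5 nodes Updated", 0, 5, 2572),
    ("1 of 2 nodes Updated", 1, 2, 5795),
    ("2 of 2 nodes Updated", 2, 2, 12008),
    ("4 of 4 nodes Updated", 4, 4, 18865),
]

def per_node_rate(packets_per_sec, node_count):
    """Average packets/sec per node in the cluster."""
    return packets_per_sec / node_count

# Use the fully non-updated cluster as the baseline.
baseline = per_node_rate(observations[0][3], observations[0][2])

for label, updated, nodes, pps in observations:
    rate = per_node_rate(pps, nodes)
    print(f"{label}: {rate:.1f} packets/sec per node, "
          f"{rate / baseline:.1f}x baseline")
```

Even on a per-Node basis, every Cluster with at least one Updated Node comes out several times chattier than the non-updated one.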

Have any of you seen this issue, or have thoughts on heartbeat chattiness being a root cause of the CPU load increase?

