Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase

High work_queue_count and THREADPOOL WAITS Expand / Collapse
Author
Message
Posted Thursday, December 30, 2010 7:40 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Yesterday @ 2:48 PM
Points: 3, Visits: 142
Hi

I am tuning the DB activity for an application that sends a high number of transactions in a very short space of time via a large number of threads.

When activity is highest I am seeing:

"SQL Server failed with error code 0xc0000000 to spawn a thread to process a new login or connection. Check the SQL Server error log and the Windows event logs for information about possible related problems."

and related connection errors on the application. Eventually the DB becomes unresponsive and I need to use the DAC to get a connection. Investigating DMVs I find that the highest WAIT stat is THREADPOOL (at 50% of resource WAITS) and SIGNAL WAITS are only < 8%

This led me to investigate sys.dm_os_schedulers, as it looked like possibe worker starvation, but I am struggling to understand the results I get:

'SELECT max_workers_count
FROM sys.dm_os_sys_info'

is 256

and yet

'SELECT scheduler_id ,
cpu_id ,
Status ,
is_online ,
is_idle ,
current_tasks_count ,
runnable_tasks_count ,
current_workers_count ,
active_workers_count ,
work_queue_count ,
pending_disk_io_count ,
load_factor
FROM sys.dm_os_schedulers
WHERE scheduler_id < 255
AND runnable_tasks_count > 0'

Shows only about 120 active workers, but 78 queued (see below)

0 255 VISIBLE ONLINE 1 0 97 9 61 61 36 0 188
1 255 VISIBLE ONLINE 1 0 100 20 58 55 42 2 189

in fact 'select count(0) from sys.dm_os_schedulers' is never more than 130

Can anyone explain why the DB seems to become starved of workers when only half of the workers are allocated?

Regards, Steve

This environment is a 2 vCPU VM running Windows Server 2003 and SQL 2005 SP3, it is running on a HOST with 4 cores.

Post #1040899
Posted Thursday, January 6, 2011 3:41 PM
SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Monday, June 9, 2014 6:02 AM
Points: 2,674, Visits: 697
well it's a bit of a guess but I'd say the combination of VMware and 2 cores just doesn't provide enough workers.
There's some things you could try: disable parallelism, enable fibres.

I'd actually suggest you try to test on a physical box and/or get more cpus.

The whole thing about performance is that it's rerely one golden bullit. For example if your transactions/batches can't finish quick enough then you'll get queuing/blocking etc. this could be that your transaction log drive can't write fast enough ( quite likely on VMware which I've found absolute rubbish under load ) what you might be seeing is the after effect rather than the cause - blocking will ramp up threads, serialisation of stored procedures ( serialised execution plans ) can cause queuing the same. You may not have enough memory, disks may be too slow .. really tricky to know for sure.


The GrumpyOldDBA
www.grumpyolddba.co.uk
http://sqlblogcasts.com/blogs/grumpyolddba/
Post #1044070
Posted Thursday, January 6, 2011 4:18 PM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Today @ 3:05 AM
Points: 6,136, Visits: 7,185
I would agree that turning down parallelism should help with the worker starvation.

One other thing about VM boxes that I recently ran into is that you will want to talk to the VM Administrator and make sure your drives are RAW. If they're not RAW, VM gets involved in every read and write to the drives, also slowing things down.



- Craig Farrell

Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

For better assistance in answering your questions | Forum Netiquette
For index/tuning help, follow these directions. |Tally Tables

Twitter: @AnyWayDBA
Post #1044078
Posted Sunday, March 3, 2013 11:31 PM


SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Monday, June 23, 2014 11:55 AM
Points: 1,708, Visits: 1,792
colin.Leversuch-Roberts (1/6/2011)
well it's a bit of a guess but I'd say the combination of VMware and 2 cores just doesn't provide enough workers.
There's some things you could try: disable parallelism, enable fibres.

I'd actually suggest you try to test on a physical box and/or get more cpus.

The whole thing about performance is that it's rerely one golden bullit. For example if your transactions/batches can't finish quick enough then you'll get queuing/blocking etc. this could be that your transaction log drive can't write fast enough ( quite likely on VMware which I've found absolute rubbish under load ) what you might be seeing is the after effect rather than the cause - blocking will ramp up threads, serialisation of stored procedures ( serialised execution plans ) can cause queuing the same. You may not have enough memory, disks may be too slow .. really tricky to know for sure.


I know that this is a really old post at this point, but I happened upon it while doing some research and I have to say that the above information is absolute rubish, even at the time that this actually posted. SQL Server under VMware scales just fine, as far back as ESX 3.0 circa the year 2000 when things are done correctly. There are a number of markers in the data presented that would point to other issues that are non-VM related here that would have required further investigation to determine the root cause of the issue here.

To start off with, this is SQL Server 2005, do the CPUID value of 255 for each scheduler tells that the schedulers are affinity masked in SQL Server to specific CPUs in the system. The 2003 nature would make me assume that this is 32 bit so you have a max worker count of 255, which is purely stipulation for the data presented, but that would make the worker thread pool saturation make a lot more sense here.

I am not trying to stir up a debate about past information here, but I think there is too little information in this thread to determine that the VM aspect is overall a problem here, and it is unlikely that this could ever be substatiated at this point. The reason for this comment is simply that I don't want an incorrect assumption to be made based on the merits of this discussion on more modern and improved virtualization platforms, and it's important to take a note of the overall lake of substantiating information that would lead to applying credibility to the information above. A lot more information would be required to actually ascertain that VM overhead was an issue here.

It is incredibly unfortunate, but I've seen a lot of old threads, like this one, that have been referenced recently for why SQL Server doesn't scale under VM configurations, so I wanted to leave this note here based on that experience. At this point, I doubt seriously that this will ever be something that gets another look from the original poster since it is so old, but something for future readers to consider. Technology changes so rapidly that you have to gauge what was said in the past against current technology and understand.


Jonathan Kehayias | Principal Consultant | MCM: SQL Server 2008
My Blog | Twitter | MVP Profile
Training | Consulting | Become a SQLskills Insider
Troubleshooting SQL Server: A Guide for Accidental DBAs
Post #1426094
« Prev Topic | Next Topic »

Add to briefcase

Permissions Expand / Collapse