SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Server Puke, CPU Pressure, or Something else?


Server Puke, CPU Pressure, or Something else?

Author
Message
Fraggle-805517
Fraggle-805517
SSCrazy
SSCrazy (2.2K reputation)SSCrazy (2.2K reputation)SSCrazy (2.2K reputation)SSCrazy (2.2K reputation)SSCrazy (2.2K reputation)SSCrazy (2.2K reputation)SSCrazy (2.2K reputation)SSCrazy (2.2K reputation)

Group: General Forum Members
Points: 2243 Visits: 1512
Hey everyone,

I am having a rather challenging issue that I am hoping to get some help on.

In the last 3 weeks, we have 3 instances where the server decided to take a puke and start stumbling along like a drunk. Slow moving, huge waits, etc. Currently, we haven't been able to figure out what the issues are. Sadly, restarting the issues resolves the issues for 5-7 days before it crops up again. What I do know is below. I just need help figuring out some different directions to look next or some really good idea to figure out what is going on.

Issues:

On 4/10, 4/17, and 4/22, the SQL 2005 SP4 64 instance simply starts to drag.
- No blocking
- Light waits on TempDB GAM/SGAM pages, but very minimal
- 60+ percent of total instance waits (once the issue starts to occur) are sos_scheduler_yeilds. Prior, it is 15-20%. During a single hour last night this reached 31k seconds of signal waits and over 7m task waiting on it.
- Sys.dm_os_schedulers go from a normal running value of 0-1 to 3-5.

Top 2 users of CPU per Confio come from the same TVF. Both are deletes. The TVF is heavily used, often being called 30K/hour

Server Specs and Config:

SQL 2005 SP4 Standard Ed 64 bit
Server 2008 R2
DAS Box with 2 SAS drives for OS and Backups
32 SSD’s for the remaining drives
128GB of ram with max server memory set to 100GB
2 x 6 core processors for a total of 24 threads with hyper threading enabled.
MAXDOP = 1
TempDB, Indexes, Data, and Logs are all on separate drives
Total of 14 TempDB files
Replicate (transactional) 95% of all reporting to reporting system. Only 1 large processes runs on the server, but was not running when the issues began.


From what I have read, when we start to have running_tasks from sys.dm_os_schedulers continually above 2-3, that it is a sign of CPU pressure. However, I don't see it on the CPU counters.

Regardless, anyone got any ideas on what might be going on or what to look at next?

Thanks,

Fraggle
Neeraj Dwivedi
Neeraj Dwivedi
SSCommitted
SSCommitted (1.9K reputation)SSCommitted (1.9K reputation)SSCommitted (1.9K reputation)SSCommitted (1.9K reputation)SSCommitted (1.9K reputation)SSCommitted (1.9K reputation)SSCommitted (1.9K reputation)SSCommitted (1.9K reputation)

Group: General Forum Members
Points: 1919 Visits: 1341
I think you must already have checked all these before but just wanted to ask?

Have you checked the execution plan for long running sqls/ stored procedures? It may be good case for execution plans going bad.

Have you tried re-compiling Sproc ?
Are the stats are being updated?
How TVF is being used in query/SP?
Fraggle-805517
Fraggle-805517
SSCrazy
SSCrazy (2.2K reputation)SSCrazy (2.2K reputation)SSCrazy (2.2K reputation)SSCrazy (2.2K reputation)SSCrazy (2.2K reputation)SSCrazy (2.2K reputation)SSCrazy (2.2K reputation)SSCrazy (2.2K reputation)

Group: General Forum Members
Points: 2243 Visits: 1512
DevDB (4/23/2013)
I think you must already have checked all these before but just wanted to ask?

Have you checked the execution plan for long running sqls/ stored procedures? It may be good case for execution plans going bad.

Have you tried re-compiling Sproc ?
Are the stats are being updated?
How TVF is being used in query/SP?


Yes, we actually tried flushing the entire cache a couple of times to no effect.

Fraggle
Neeraj Dwivedi
Neeraj Dwivedi
SSCommitted
SSCommitted (1.9K reputation)SSCommitted (1.9K reputation)SSCommitted (1.9K reputation)SSCommitted (1.9K reputation)SSCommitted (1.9K reputation)SSCommitted (1.9K reputation)SSCommitted (1.9K reputation)SSCommitted (1.9K reputation)

Group: General Forum Members
Points: 1919 Visits: 1341
Flushing the entire cache, I think is never a good idea. But what I was more interested in to see the execution plan plan for the slow running sql when it is running fast and when it is bad. Do you have anything to compare for good vs bad plan?

What I believe that when the query/ SP is running fast, is using parallel plan but in bad mode it is using the serial plan. Which will cause the query to run very- slow.

Multi- statement TVF may cause the query to use serial plan.
So if you could share some more information, that will be helpful.
Fraggle-805517
Fraggle-805517
SSCrazy
SSCrazy (2.2K reputation)SSCrazy (2.2K reputation)SSCrazy (2.2K reputation)SSCrazy (2.2K reputation)SSCrazy (2.2K reputation)SSCrazy (2.2K reputation)SSCrazy (2.2K reputation)SSCrazy (2.2K reputation)

Group: General Forum Members
Points: 2243 Visits: 1512
DevDB (4/23/2013)
Flushing the entire cache, I think is never a good idea. But what I was more interested in to see the execution plan plan for the slow running sql when it is running fast and when it is bad. Do you have anything to compare for good vs bad plan?

What I believe that when the query/ SP is running fast, is using parallel plan but in bad mode it is using the serial plan. Which will cause the query to run very- slow.

Multi- statement TVF may cause the query to use serial plan.
So if you could share some more information, that will be helpful.


I have looked at the plans between the two and they are the same. As the MAXDOP = 1 for the server, parallelism isn't really a concern. The MAXDOP has been set to 1 for 12-18 months now, so it isn't a new change to the system.

If needed, I can get you the plans. Maybe you will see something that I am not. However, as TVF don't really show up well in execution plans, not sure how much value there will be.

Fraggle
Lynn Pettis
Lynn Pettis
SSC Guru
SSC Guru (89K reputation)SSC Guru (89K reputation)SSC Guru (89K reputation)SSC Guru (89K reputation)SSC Guru (89K reputation)SSC Guru (89K reputation)SSC Guru (89K reputation)SSC Guru (89K reputation)

Group: General Forum Members
Points: 89602 Visits: 38932
If you can show the code using the TVF and the code for the TVF would be helpful as well as the actual execution plans for both a fast and slow run of the query. As this appears to be on SQL Server 2005, please save and post the plans as .sqlplan files.

Cool
Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
Neeraj Dwivedi
Neeraj Dwivedi
SSCommitted
SSCommitted (1.9K reputation)SSCommitted (1.9K reputation)SSCommitted (1.9K reputation)SSCommitted (1.9K reputation)SSCommitted (1.9K reputation)SSCommitted (1.9K reputation)SSCommitted (1.9K reputation)SSCommitted (1.9K reputation)

Group: General Forum Members
Points: 1919 Visits: 1341
Yes you are right. For Optimizer TVF is like black box.
Simple Single statement TVF can be in lined by SQL but for multi line TVF SQL Server will repeatedly
execute the function – for every row in the result set. Which will cause the CPU to spike a lot.

Have you tried replacing the TVF with inline SQL query logic?

Also one of the issues with functions that they don't get good stats from optimizer and sql server has a no way to know at compile time how much data it is going to return and it works on assumption that only one row is going to return from function.

Have you tried seeing the execution plan just for TVF for different kind of parameter to see if that will help?
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search