Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase

Server Puke, CPU Pressure, or Something else? Expand / Collapse
Author
Message
Posted Tuesday, April 23, 2013 10:36 AM
Right there with Babe

Right there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with Babe

Group: General Forum Members
Last Login: Friday, September 26, 2014 9:59 AM
Points: 727, Visits: 1,419
Hey everyone,

I am having a rather challenging issue that I am hoping to get some help on.

In the last 3 weeks, we have 3 instances where the server decided to take a puke and start stumbling along like a drunk. Slow moving, huge waits, etc. Currently, we haven't been able to figure out what the issues are. Sadly, restarting the issues resolves the issues for 5-7 days before it crops up again. What I do know is below. I just need help figuring out some different directions to look next or some really good idea to figure out what is going on.

Issues:

On 4/10, 4/17, and 4/22, the SQL 2005 SP4 64 instance simply starts to drag.
- No blocking
- Light waits on TempDB GAM/SGAM pages, but very minimal
- 60+ percent of total instance waits (once the issue starts to occur) are sos_scheduler_yeilds. Prior, it is 15-20%. During a single hour last night this reached 31k seconds of signal waits and over 7m task waiting on it.
- Sys.dm_os_schedulers go from a normal running value of 0-1 to 3-5.

Top 2 users of CPU per Confio come from the same TVF. Both are deletes. The TVF is heavily used, often being called 30K/hour

Server Specs and Config:

SQL 2005 SP4 Standard Ed 64 bit
Server 2008 R2
DAS Box with 2 SAS drives for OS and Backups
32 SSD’s for the remaining drives
128GB of ram with max server memory set to 100GB
2 x 6 core processors for a total of 24 threads with hyper threading enabled.
MAXDOP = 1
TempDB, Indexes, Data, and Logs are all on separate drives
Total of 14 TempDB files
Replicate (transactional) 95% of all reporting to reporting system. Only 1 large processes runs on the server, but was not running when the issues began.


From what I have read, when we start to have running_tasks from sys.dm_os_schedulers continually above 2-3, that it is a sign of CPU pressure. However, I don't see it on the CPU counters.

Regardless, anyone got any ideas on what might be going on or what to look at next?

Thanks,

Fraggle
Post #1445568
Posted Tuesday, April 23, 2013 11:30 AM
SSC Eights!

SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!

Group: General Forum Members
Last Login: Tuesday, September 23, 2014 10:34 AM
Points: 832, Visits: 1,177
I think you must already have checked all these before but just wanted to ask?

Have you checked the execution plan for long running sqls/ stored procedures? It may be good case for execution plans going bad.

Have you tried re-compiling Sproc ?
Are the stats are being updated?
How TVF is being used in query/SP?
Post #1445601
Posted Tuesday, April 23, 2013 12:10 PM
Right there with Babe

Right there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with Babe

Group: General Forum Members
Last Login: Friday, September 26, 2014 9:59 AM
Points: 727, Visits: 1,419
DevDB (4/23/2013)
I think you must already have checked all these before but just wanted to ask?

Have you checked the execution plan for long running sqls/ stored procedures? It may be good case for execution plans going bad.

Have you tried re-compiling Sproc ?
Are the stats are being updated?
How TVF is being used in query/SP?


Yes, we actually tried flushing the entire cache a couple of times to no effect.

Fraggle
Post #1445619
Posted Tuesday, April 23, 2013 12:48 PM
SSC Eights!

SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!

Group: General Forum Members
Last Login: Tuesday, September 23, 2014 10:34 AM
Points: 832, Visits: 1,177
Flushing the entire cache, I think is never a good idea. But what I was more interested in to see the execution plan plan for the slow running sql when it is running fast and when it is bad. Do you have anything to compare for good vs bad plan?

What I believe that when the query/ SP is running fast, is using parallel plan but in bad mode it is using the serial plan. Which will cause the query to run very- slow.

Multi- statement TVF may cause the query to use serial plan.
So if you could share some more information, that will be helpful.
Post #1445639
Posted Tuesday, April 23, 2013 1:15 PM
Right there with Babe

Right there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with Babe

Group: General Forum Members
Last Login: Friday, September 26, 2014 9:59 AM
Points: 727, Visits: 1,419
DevDB (4/23/2013)
Flushing the entire cache, I think is never a good idea. But what I was more interested in to see the execution plan plan for the slow running sql when it is running fast and when it is bad. Do you have anything to compare for good vs bad plan?

What I believe that when the query/ SP is running fast, is using parallel plan but in bad mode it is using the serial plan. Which will cause the query to run very- slow.

Multi- statement TVF may cause the query to use serial plan.
So if you could share some more information, that will be helpful.


I have looked at the plans between the two and they are the same. As the MAXDOP = 1 for the server, parallelism isn't really a concern. The MAXDOP has been set to 1 for 12-18 months now, so it isn't a new change to the system.

If needed, I can get you the plans. Maybe you will see something that I am not. However, as TVF don't really show up well in execution plans, not sure how much value there will be.

Fraggle
Post #1445661
Posted Tuesday, April 23, 2013 1:25 PM


SSC-Insane

SSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-Insane

Group: General Forum Members
Last Login: Today @ 7:59 PM
Points: 20,702, Visits: 32,342
If you can show the code using the TVF and the code for the TVF would be helpful as well as the actual execution plans for both a fast and slow run of the query. As this appears to be on SQL Server 2005, please save and post the plans as .sqlplan files.



Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
Post #1445664
Posted Tuesday, April 23, 2013 1:47 PM
SSC Eights!

SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!

Group: General Forum Members
Last Login: Tuesday, September 23, 2014 10:34 AM
Points: 832, Visits: 1,177
Yes you are right. For Optimizer TVF is like black box.
Simple Single statement TVF can be in lined by SQL but for multi line TVF SQL Server will repeatedly
execute the function – for every row in the result set. Which will cause the CPU to spike a lot.

Have you tried replacing the TVF with inline SQL query logic?

Also one of the issues with functions that they don't get good stats from optimizer and sql server has a no way to know at compile time how much data it is going to return and it works on assumption that only one row is going to return from function.

Have you tried seeing the execution plan just for TVF for different kind of parameter to see if that will help?
Post #1445668
« Prev Topic | Next Topic »

Add to briefcase

Permissions Expand / Collapse