Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase

Stuck tracking down CXPACKET issue Expand / Collapse
Author
Message
Posted Monday, November 4, 2013 6:42 AM
Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Friday, November 21, 2014 6:55 AM
Points: 336, Visits: 596
We have a server that often has the following happen:

1) It breaks a SPID down and assignes hundreds of threads to the SPID (not shown in sample below)
2) All threads for the SIPD go into CXPACKET wait for an extended amount of time

So we have two issues, how to keep a complex query from eating up all workers, and how to find out what is putting these queries on hold for extended amount of time (this one was on hold for 1560 seconds until I killed the SPID.


Spid ecid Age Seconds status wait_type
125 21 1560 suspended CXPACKET
125 22 1560 suspended CXPACKET
125 23 1560 suspended CXPACKET
125 24 1560 suspended CXPACKET
125 25 1560 suspended CXPACKET
125 26 1560 suspended CXPACKET
125 27 1560 suspended CXPACKET
125 28 1560 suspended CXPACKET
125 29 1560 suspended CXPACKET
125 30 1560 suspended CXPACKET
125 31 1560 suspended CXPACKET
125 32 1560 suspended CXPACKET
125 33 1560 suspended CXPACKET
125 34 1560 suspended CXPACKET
125 35 1560 suspended CXPACKET
125 36 1560 suspended CXPACKET
125 37 1560 suspended CXPACKET
125 38 1560 suspended CXPACKET
125 39 1560 suspended CXPACKET
125 40 1560 suspended CXPACKET
125 41 1560 suspended CXPACKET
125 42 1560 suspended CXPACKET
125 43 1560 suspended CXPACKET
125 44 1560 suspended CXPACKET
125 45 1560 suspended CXPACKET
125 46 1560 suspended CXPACKET
125 47 1560 suspended CXPACKET
125 48 1560 suspended CXPACKET
Post #1511078
Posted Monday, November 4, 2013 6:45 PM
Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: 2 days ago @ 2:16 PM
Points: 320, Visits: 966

You can set MAXDOP to half the number of cores on your server if the instance is the only instance on the server.
Fewer if more than one instance is on your server.
That will prevent processes from taking over your instance.
Post #1511339
Posted Tuesday, November 5, 2013 7:54 AM
Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Friday, November 21, 2014 6:55 AM
Points: 336, Visits: 596
Currently our Cost of THreshold for Parallelism is 5 and the Max Degree of Parallelism is 0.

We have 24 cores.

Does that change your suggestion any?
Post #1511497
Posted Tuesday, November 5, 2013 9:08 AM
Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: 2 days ago @ 2:16 PM
Points: 320, Visits: 966
You could set maxdop to 12 or 8 and see if that helps.
Post #1511528
Posted Tuesday, November 5, 2013 9:14 AM
Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: 2 days ago @ 2:16 PM
Points: 320, Visits: 966

Mind you, this is only a band-aid.

You really should identify the query that is causing the issue and tune it.
Post #1511530
Posted Thursday, November 7, 2013 6:56 AM


SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Today @ 1:11 AM
Points: 2,840, Visits: 3,976
dwilliscp (11/5/2013)
Currently our Cost of THreshold for Parallelism is 5
you also need to increase value of threshold to avoid early parallelism occurence


-------Bhuvnesh----------
I work only to learn Sql Server...though my company pays me for getting their stuff done
Post #1512241
Posted Thursday, November 7, 2013 7:02 AM
Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Friday, November 21, 2014 6:55 AM
Points: 336, Visits: 596
This morning when it failed there were only 7 SIPDs running on the server, we can subtract two... one was the base Trace and the second was database mail.

So that leaves just 5. My capturing of processes.. showing wait.. is a SQL statement. So how would I modify the following to capture the execution plan, without creating a trace to capture all execution plans? Note I have this running every 15min during the two hours that we have the most failures due to query timeout. Side note only the Jobs created parrallel processing.. 49 threads, 25 threads, and 49 threads. (Of course in this case I can get an estimated execution plan by selecting the job.)

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
insert into zt_wait_Hist(
Capture_DT
, [Spid]
, ecid
, start_time
, [Age Seconds]
, nt_username
, [status]
, wait_type
, [Individual Query]
, [Parent Query]
, [program_name]
, Hostname
, nt_domain
)
SELECT
GETDATE() as Capture_DT
, er.session_Id AS [Spid]
, sp.ecid
, er.start_time
, DATEDIFF(SS,er.start_time,GETDATE()) as [Age Seconds]
, sp.nt_username
, er.status
, er.wait_type
, SUBSTRING (qt.text, (er.statement_start_offset/2) + 1,
((CASE WHEN er.statement_end_offset = -1
THEN LEN(CONVERT(NVARCHAR(MAX), qt.text)) * 2
ELSE er.statement_end_offset
END - er.statement_start_offset)/2) + 1) AS [Individual Query]
, qt.text AS [Parent Query]
, sp.program_name
, sp.Hostname
, sp.nt_domain

FROM sys.dm_exec_requests er
INNER JOIN sys.sysprocesses sp ON er.session_id = sp.spid
CROSS APPLY sys.dm_exec_sql_text(er.sql_handle)as qt
WHERE session_Id > 50
AND session_Id NOT IN (@@SPID)
ORDER BY session_Id, ecid
Post #1512245
Posted Thursday, November 7, 2013 7:23 AM
Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Friday, November 21, 2014 6:55 AM
Points: 336, Visits: 596
If I just take the Individual Query and run an estimated plan

Plan A: it does not show any suggested index's .. you start with a Clustered Index Scan (cost 74%), move to Sort (20%) > Parallelism (6%) > Select (0%)

This is the one that failed to execute in 20min.

No blocking showed up in the trace...

Estimated rows 28M, estimated cost 1,878, cashed plan 24b

All SPIDs of these SPIDs were executing Select statements.. the writes were going on in the SSAS cubes. (for the jobs) and the users were only doing selects.
Post #1512255
Posted Thursday, November 7, 2013 8:06 AM
Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Friday, November 21, 2014 6:55 AM
Points: 336, Visits: 596
You know ... I am begining to wonder if the problem lies in SSAS, I have passed this mornings information off to the team that handles our cubes.
Post #1512291
« Prev Topic | Next Topic »

Add to briefcase

Permissions Expand / Collapse