Click here to monitor SSC
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


SQL Trace causes application to hang


SQL Trace causes application to hang

Author
Message
adb2303
adb2303
SSC Eights!
SSC Eights! (962 reputation)SSC Eights! (962 reputation)SSC Eights! (962 reputation)SSC Eights! (962 reputation)SSC Eights! (962 reputation)SSC Eights! (962 reputation)SSC Eights! (962 reputation)SSC Eights! (962 reputation)

Group: General Forum Members
Points: 962 Visits: 2886
Hi,

We've recently upgraded to Windows 2008 R2, SQL Server 2008 R2 SP + CU4, multi-instance cluster. 32 cores (64 with h/t), 384GB per node. Maxdop 4 server wide, 320GB max server memory. It used to be Win2k3 R2, SQL Server 2005 SP3, 2 x single instance clusters - 8 cores, 96GB RAM, Maxdop 0 server wide.

We have 6 databases on this server, the main one being an Enterprise Data Management Platform that munges data from lots of different sources. It's a 3rd party product that has something like 18000 encrypted stored procedures, 12000 tables, 3000 views. Good, eh ;-). There are triggers on a lot of these tables that send insert/update/deletes to a local staging database, then on to a repository database across a linked server.

For the past few years we have been using a SQL monitoring product (name witheld to protect the innocent) against out production cluster (multi-instance) without any problems. However, on the day we upgraded, when we turned on a particular feature that monitors the types of queries being executed against the server, some data load went from completing in 15 mins, to taking 14 hrs to run. When we turned off this feature, it returned to normal. No other databases appear to be affected.

We contacted the vendor of the monitoring software and found out in essense that this feature runs a SQL trace capturing the following events: SQL:BatchCompleted, SQL:StmtCompleted, SP:Completed for events with a run duration of greater than 5000ms. Sure enough, just running the trace against the server whilst processing data had exactly the same effect as using the query monitoring feature of the software. So, I think we say that although the monitoring software causes the problem, it does so only by leveraging a feature native to SQL Server.

The EDM vendor was aware of the conflict between their product and the monitoring software but have been unable to explain the cause, or reproduce it in their environment. They've also tried against a version of the DB that doesn't have any encrypted procedures (I don't have that copy).

During the slow down, I query sys.dm_exec_requests and all of the sessions associated with the hung jobs have a RUNNABLE status with a 'last_wait_type' of SOS_SCHEDULER_YIELD. There's nothing in sys.dm_os_waiting_tasks. No evident blocking. CPU's are completely idle. I tried changing MAXDOP settings, affinity, hyperthreading, etc. as well (to make it more like the old system), but no effect. As soon as I stop the trace, everything resumes at normal speeds and runs through to completion.

These are the top 10 waitstats:


wait_type wait_time_ms signal_wait_time_ms resource_wait_time_ms percent_total_waits percent_total_signal_waits percent_total_resource_waits
PREEMPTIVE_OS_WRITEFILEGATHER 4729935 0 8553549 0.308682089780738 0.000000000000000 0.308682089780738
BACKUPTHREAD 4721076 2466 8620446 0.311185274808784 0.001188989695061 0.311096281335619
ASYNC_IO_COMPLETION 1543967 532 18797472 0.678385914247604 0.000256505481659 0.678366715331252
OLEDB 1019783 0 93509304 3.374581401551842 0.000000000000000 3.374581401551842
LCK_M_IX 192111 707 458001 0.016553940830775 0.000340882284837 0.016528426481413
MSQL_DQ 167309 0 2099902 0.075781659483654 0.000000000000000 0.075781659483654
LCK_M_IS 94128 69 1338334 0.048300539928959 0.000033268568109 0.048298049843943
LCK_M_U 86708 316 270195 0.009762251994894 0.000152360398880 0.009750848127286
LCK_M_S 77223 1699 245245 0.008911761653415 0.000819178220563 0.008850447820931
LCK_M_X 61582 1207 241601 0.008762500905236 0.000581958865344 0.008718942461557




Any ideas what else I might check? Like I said, this used to work fine under SQL 2005/Win2k3, with the same version of the monitoring software.

Thanks,
Andy
adb2303
adb2303
SSC Eights!
SSC Eights! (962 reputation)SSC Eights! (962 reputation)SSC Eights! (962 reputation)SSC Eights! (962 reputation)SSC Eights! (962 reputation)SSC Eights! (962 reputation)SSC Eights! (962 reputation)SSC Eights! (962 reputation)

Group: General Forum Members
Points: 962 Visits: 2886
looks specifically to be sp:Completed that's causing the problems. Any ideas?
Perry Whittle
Perry Whittle
SSCrazy Eights
SSCrazy Eights (8.8K reputation)SSCrazy Eights (8.8K reputation)SSCrazy Eights (8.8K reputation)SSCrazy Eights (8.8K reputation)SSCrazy Eights (8.8K reputation)SSCrazy Eights (8.8K reputation)SSCrazy Eights (8.8K reputation)SSCrazy Eights (8.8K reputation)

Group: General Forum Members
Points: 8826 Visits: 16562
adb2303 (11/22/2012)
Windows 2008 R2, SQL Server 2008 R2 SP + CU4, multi-instance cluster.

w00t

adb2303 (11/22/2012)
32 cores (64 with h/t),

w00t

adb2303 (11/22/2012)
384GB per node.

w00t


adb2303 (11/22/2012)
Maxdop 4 server wide

??
Why dont you just pull all the CPUs back out except for the one socket in that case.

Why have you installed 32 cores and then put a choker around SQL Servers neck?

-----------------------------------------------------------------------------------------------------------

"Ya can't make an omelette without breaking just a few eggs" ;-)
adb2303
adb2303
SSC Eights!
SSC Eights! (962 reputation)SSC Eights! (962 reputation)SSC Eights! (962 reputation)SSC Eights! (962 reputation)SSC Eights! (962 reputation)SSC Eights! (962 reputation)SSC Eights! (962 reputation)SSC Eights! (962 reputation)

Group: General Forum Members
Points: 962 Visits: 2886
weeks of testing demonstrated that MAXDOP 4 was most performant for our workload
Perry Whittle
Perry Whittle
SSCrazy Eights
SSCrazy Eights (8.8K reputation)SSCrazy Eights (8.8K reputation)SSCrazy Eights (8.8K reputation)SSCrazy Eights (8.8K reputation)SSCrazy Eights (8.8K reputation)SSCrazy Eights (8.8K reputation)SSCrazy Eights (8.8K reputation)SSCrazy Eights (8.8K reputation)

Group: General Forum Members
Points: 8826 Visits: 16562
you never considered raising the Cost threshold for parallelism instead?

-----------------------------------------------------------------------------------------------------------

"Ya can't make an omelette without breaking just a few eggs" ;-)
adb2303
adb2303
SSC Eights!
SSC Eights! (962 reputation)SSC Eights! (962 reputation)SSC Eights! (962 reputation)SSC Eights! (962 reputation)SSC Eights! (962 reputation)SSC Eights! (962 reputation)SSC Eights! (962 reputation)SSC Eights! (962 reputation)

Group: General Forum Members
Points: 962 Visits: 2886
yes. We still achieved more consistent results with setting maxdop.
adb2303
adb2303
SSC Eights!
SSC Eights! (962 reputation)SSC Eights! (962 reputation)SSC Eights! (962 reputation)SSC Eights! (962 reputation)SSC Eights! (962 reputation)SSC Eights! (962 reputation)SSC Eights! (962 reputation)SSC Eights! (962 reputation)

Group: General Forum Members
Points: 962 Visits: 2886
anyone seen this before? It looks specifically to be the SP:Completed event that causes the delays. I suspect it's to do with the encrypted procedures, but 2005 worked okay.
ESAT ERKEC
ESAT ERKEC
Old Hand
Old Hand (300 reputation)Old Hand (300 reputation)Old Hand (300 reputation)Old Hand (300 reputation)Old Hand (300 reputation)Old Hand (300 reputation)Old Hand (300 reputation)Old Hand (300 reputation)

Group: General Forum Members
Points: 300 Visits: 897
if ı were you Firtly ı make maxdop 1 and ı set automaticly set processor affinity
and automaticly set i/o affinity
adb2303
adb2303
SSC Eights!
SSC Eights! (962 reputation)SSC Eights! (962 reputation)SSC Eights! (962 reputation)SSC Eights! (962 reputation)SSC Eights! (962 reputation)SSC Eights! (962 reputation)SSC Eights! (962 reputation)SSC Eights! (962 reputation)

Group: General Forum Members
Points: 962 Visits: 2886
Hi,

I've already tried a variety of maxdop and affinitiy options, to no avail. I can reproduce this problem in our test environments, which only have 2 vCPUs...
ESAT ERKEC
ESAT ERKEC
Old Hand
Old Hand (300 reputation)Old Hand (300 reputation)Old Hand (300 reputation)Old Hand (300 reputation)Old Hand (300 reputation)Old Hand (300 reputation)Old Hand (300 reputation)Old Hand (300 reputation)

Group: General Forum Members
Points: 300 Visits: 897
I think it is related your encreptyted procedures
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search