Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase

Very interested to know what can be done in the case of a long running select query Expand / Collapse
Author
Message
Posted Saturday, April 20, 2013 4:21 PM
SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Wednesday, July 16, 2014 10:31 AM
Points: 298, Visits: 904
Hi everyone. I'd love to get some help understanding why SQL Server 2008R2 x64 Enterprise Edition does what it does.

The SQL Server is dedicated to an application that as I understand it processes workloads which take 5-6 days to complete and then begins processing the next workload.

Throughout the week, there are constant inserts/updates/deletes that take place in different SPIDs. During the weekly workload there is a single select query that stays active for almost the entire time. This query does nothing that I can interperate as database activity but holds several locks which prevent me from doing maintenance etc on the SQL Server.

The SQL Server has 48 cores and 128GB RAM. We have MDOP set on this instance to 32. Tried MDOP settings of 0, 1, 6, 8, 16, and 32 but have not determined what setting would work best. If I recall, there were a lot of CXPACKET waits when MDOP was set to 0. Parallism threshhold has always stayed at the default value of 5.

Subtree cost in the execution plan for this log running select query started at 9000+ but with the addition of a few indexes and even an indexed view, we have reduced the cost to ~90. This seems like a big win but the query still stays open for days with no apparent database activity. The common wait type for the query is ASYNC_NETWORK_IO which lasts for about 2 second on average so I believe that the bottleneck may be the application slowly taking in the data collected innitialy by the long running select query. Does that make sense?

I wonder if there is anything we can do on SQL Server's side to improve things. My thought is that since SQL Server is parallelizing the query to 32 threads and then just sitting there for days, it is really not taking advantage of the parallel threads but instead blocking other operations from using them. Could a plan guide that forces MDOP of 1 for this specific query free up the other threads?

Mainly, is there anything I can do to prevent the query from staying open for days?

Please see the below output collected by Spotlight on SQL Server for the long running spid. Is is only a moment in time but it is pretty typical of what I see all week. Wish I knew how to format it better so the header rows lined up with the output.


SPID Status Blocked By Last Command Memory CPU ms/s I/O per sec Logical Reads Reads Writes Total CPU Total I/O Total Logical Reads Total Reads Total Writes Current Wait Time (ms) Last Wait Type Last Wait Resource Last Batch Time Time Since Last Batch Tran Count Host Process Net Library Login Time
93 running 0 SELECT 2 0 0 0 0 0 247,635 137,435 1,265,971 137,434 1 1,230 ASYNC_NETWORK_IO 4/18/2013 17:00 1d 21:47:51 0 3672 TCP 4/18/2013 16:59
93 running 0 SELECT 2 0 0 0 0 0 247,635 137,435 1,265,971 137,434 1 1,230 ASYNC_NETWORK_IO 4/18/2013 17:00 1d 21:47:51 0 3672 TCP 4/18/2013 16:59
93 running 0 SELECT 2 0 0 0 0 0 247,635 137,435 1,265,971 137,434 1 1,230 ASYNC_NETWORK_IO 4/18/2013 17:00 1d 21:47:51 0 3672 TCP 4/18/2013 16:59
93 running 0 SELECT 2 0 0 0 0 0 247,635 137,435 1,265,971 137,434 1 1,230 ASYNC_NETWORK_IO 4/18/2013 17:00 1d 21:47:51 0 3672 TCP 4/18/2013 16:59
93 running 0 SELECT 2 0 0 0 0 0 247,635 137,435 1,265,971 137,434 1 1,230 ASYNC_NETWORK_IO 4/18/2013 17:00 1d 21:47:51 0 3672 TCP 4/18/2013 16:59
93 running 0 SELECT 2 0 0 0 0 0 247,635 137,435 1,265,971 137,434 1 1,230 ASYNC_NETWORK_IO 4/18/2013 17:00 1d 21:47:51 0 3672 TCP 4/18/2013 16:59
93 running 0 SELECT 2 0 0 0 0 0 247,635 137,435 1,265,971 137,434 1 1,230 ASYNC_NETWORK_IO 4/18/2013 17:00 1d 21:47:51 0 3672 TCP 4/18/2013 16:59
93 running 0 SELECT 2 0 0 0 0 0 247,635 137,435 1,265,971 137,434 1 1,230 ASYNC_NETWORK_IO 4/18/2013 17:00 1d 21:47:51 0 3672 TCP 4/18/2013 16:59
93 running 0 SELECT 2 0 0 0 0 0 247,635 137,435 1,265,971 137,434 1 1,230 ASYNC_NETWORK_IO 4/18/2013 17:00 1d 21:47:51 0 3672 TCP 4/18/2013 16:59
93 running 0 SELECT 2 0 0 0 0 0 247,635 137,435 1,265,971 137,434 1 1,230 ASYNC_NETWORK_IO 4/18/2013 17:00 1d 21:47:51 0 3672 TCP 4/18/2013 16:59
93 running 0 SELECT 2 0 0 0 0 0 247,635 137,435 1,265,971 137,434 1 1,230 ASYNC_NETWORK_IO 4/18/2013 17:00 1d 21:47:51 0 3672 TCP 4/18/2013 16:59
93 running 0 SELECT 2 0 0 0 0 0 247,635 137,435 1,265,971 137,434 1 1,230 ASYNC_NETWORK_IO 4/18/2013 17:00 1d 21:47:51 0 3672 TCP 4/18/2013 16:59
93 running 0 SELECT 2 0 0 0 0 0 247,635 137,435 1,265,971 137,434 1 1,230 ASYNC_NETWORK_IO 4/18/2013 17:00 1d 21:47:51 0 3672 TCP 4/18/2013 16:59
93 running 0 SELECT 2 0 0 0 0 0 247,635 137,435 1,265,971 137,434 1 1,230 ASYNC_NETWORK_IO 4/18/2013 17:00 1d 21:47:51 0 3672 TCP 4/18/2013 16:59
93 running 0 SELECT 2 0 0 0 0 0 247,635 137,435 1,265,971 137,434 1 1,230 ASYNC_NETWORK_IO 4/18/2013 17:00 1d 21:47:51 0 3672 TCP 4/18/2013 16:59
93 running 0 SELECT 2 0 0 0 0 0 247,635 137,435 1,265,971 137,434 1 1,230 ASYNC_NETWORK_IO 4/18/2013 17:00 1d 21:47:51 0 3672 TCP 4/18/2013 16:59
93 running 0 SELECT 2 0 0 0 0 0 247,635 137,435 1,265,971 137,434 1 1,230 ASYNC_NETWORK_IO 4/18/2013 17:00 1d 21:47:51 0 3672 TCP 4/18/2013 16:59
93 running 0 SELECT 2 0 0 0 0 0 247,635 137,435 1,265,971 137,434 1 1,230 ASYNC_NETWORK_IO 4/18/2013 17:00 1d 21:47:51 0 3672 TCP 4/18/2013 16:59
93 running 0 SELECT 2 0 0 0 0 0 247,635 137,435 1,265,971 137,434 1 1,230 ASYNC_NETWORK_IO 4/18/2013 17:00 1d 21:47:51 0 3672 TCP 4/18/2013 16:59
93 running 0 SELECT 2 0 0 0 0 0 247,635 137,435 1,265,971 137,434 1 1,230 ASYNC_NETWORK_IO 4/18/2013 17:00 1d 21:47:51 0 3672 TCP 4/18/2013 16:59
93 running 0 SELECT 2 0 0 0 0 0 247,635 137,435 1,265,971 137,434 1 1,230 ASYNC_NETWORK_IO 4/18/2013 17:00 1d 21:47:51 0 3672 TCP 4/18/2013 16:59
93 running 0 SELECT 2 0 0 0 0 0 247,635 137,435 1,265,971 137,434 1 1,230 ASYNC_NETWORK_IO 4/18/2013 17:00 1d 21:47:51 0 3672 TCP 4/18/2013 16:59
93 running 0 SELECT 2 0 0 0 0 0 247,635 137,435 1,265,971 137,434 1 1,230 ASYNC_NETWORK_IO 4/18/2013 17:00 1d 21:47:51 0 3672 TCP 4/18/2013 16:59
93 running 0 SELECT 2 0 0 0 0 0 247,635 137,435 1,265,971 137,434 1 1,230 ASYNC_NETWORK_IO 4/18/2013 17:00 1d 21:47:51 0 3672 TCP 4/18/2013 16:59
93 running 0 SELECT 2 0 0 0 0 0 247,635 137,435 1,265,971 137,434 1 1,230 ASYNC_NETWORK_IO 4/18/2013 17:00 1d 21:47:51 0 3672 TCP 4/18/2013 16:59
93 running 0 SELECT 2 0 0 0 0 0 247,635 137,435 1,265,971 137,434 1 1,230 ASYNC_NETWORK_IO 4/18/2013 17:00 1d 21:47:51 0 3672 TCP 4/18/2013 16:59
93 running 0 SELECT 2 0 0 0 0 0 247,635 137,435 1,265,971 137,434 1 1,230 ASYNC_NETWORK_IO 4/18/2013 17:00 1d 21:47:51 0 3672 TCP 4/18/2013 16:59
93 running 0 SELECT 2 0 0 0 0 0 247,635 137,435 1,265,971 137,434 1 1,230 ASYNC_NETWORK_IO 4/18/2013 17:00 1d 21:47:51 0 3672 TCP 4/18/2013 16:59
93 running 0 SELECT 2 0 0 0 0 0 247,635 137,435 1,265,971 137,434 1 1,230 ASYNC_NETWORK_IO 4/18/2013 17:00 1d 21:47:51 0 3672 TCP 4/18/2013 16:59
93 running 0 SELECT 2 0 0 0 0 0 247,635 137,435 1,265,971 137,434 1 1,230 ASYNC_NETWORK_IO 4/18/2013 17:00 1d 21:47:51 0 3672 TCP 4/18/2013 16:59
93 running 0 SELECT 2 0 0 0 0 0 247,635 137,435 1,265,971 137,434 1 1,230 ASYNC_NETWORK_IO 4/18/2013 17:00 1d 21:47:51 0 3672 TCP 4/18/2013 16:59
93 running 0 SELECT 2 0 0 0 0 0 247,635 137,435 1,265,971 137,434 1 1,230 ASYNC_NETWORK_IO 4/18/2013 17:00 1d 21:47:51 0 3672 TCP 4/18/2013 16:59
93 running 0 SELECT 2 0 0 0 0 0 247,635 137,435 1,265,971 137,434 1 1,230 ASYNC_NETWORK_IO 4/18/2013 17:00 1d 21:47:51 0 3672 TCP 4/18/2013 16:59




Thanks for reading and any insight.

Howard
Post #1444753
Posted Saturday, April 20, 2013 4:31 PM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Monday, July 14, 2014 6:07 AM
Points: 182, Visits: 343
I guess Hyper Treading is enabled on your server. Microsoft recommends to set maxdop=number of physical processors when HT is enabled. However they also recommend to monitor and test it properly to identify what maxdop value suits to your environment. details http://support.microsoft.com/kb/322385
Post #1444754
Posted Saturday, April 20, 2013 4:40 PM
SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Wednesday, July 16, 2014 10:31 AM
Points: 298, Visits: 904
Mainly I am interested in understanding why a select query would stay open and running for almost an entire week. CPU use rarely if ever goes above 5 to 10%. The SQL Server is definitely busy doing something but I don't know what.

The select query joins tables that contain hundreds of millions of rows. Adding non-clustered indexes have made the query just about as optimal as I can get it.

The whole thing wouldn't bother me that much other than extreme curiosity but for the fact that index maintenance always causes blocking due to locks that are held all week. I spend a good deal of time trying to squeeze performance out of this database server because the application team is very frustrated with the overall performance of the application and I want to feel certain that there is not an optimization on the database that I am missing.
Post #1444755
Posted Saturday, April 20, 2013 5:12 PM


SSC-Forever

SSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-Forever

Group: General Forum Members
Last Login: Today @ 1:14 AM
Points: 42,412, Visits: 35,479
Would need to look at the exec plan to say anything useful. Could be a bad execution plan for one reason or another and the query is just running and running.

Async network IO is usually an application problem, maybe take this up with whoever developed the system that is running the query.



Gail Shaw
Microsoft Certified Master: SQL Server 2008, MVP
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Post #1444759
Posted Saturday, April 20, 2013 5:16 PM
SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Wednesday, July 16, 2014 10:31 AM
Points: 298, Visits: 904
Thanks much. What could I provide that would be useful? The XML of the execution plan? Product has been discontinued and the vendor is no longer actively supporting it as they are encouraging us to start over again with their new archival product.
Post #1444760
Posted Sunday, April 21, 2013 12:04 AM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Thursday, April 17, 2014 1:41 AM
Points: 170, Visits: 1,400
With no specifics we're all shooting in the dark here, but if it stays 'processing' for days and you have no obvious way to tell what it's doing, it may be worth checking for blocking as well.

Set the blocking process report with sp_configure. A value of 30 should be fine (checking for blocking every 30 secs). Then set a server-side trace to capture the blocking events. If after a few days there is nothing in the trace you will have eliminated blocking as a culprit. If you do get fish in the net, you'll know what holds you back.


Cheers,

JohnA

MCM: SQL2008
Post #1444777
Posted Sunday, April 21, 2013 12:12 AM
SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Wednesday, July 16, 2014 10:31 AM
Points: 298, Visits: 904
Thanks for responding John. There is no blocking going on. If I try to run index maintenance on an index that belongs to a table in the long-running select query, it will immediately cause blocking of my maintenance spid and all other processes that involve that table. Once I kill the spid of my index maintenance, all blocking clears up immediately.

Really, I don't understand why a select query would stay in a running state for several days.

I don't get anything useful from this SQL Profiler as the statement start was generated when the query began and the statement ending doesn't occur until several days later.
Post #1444778
Posted Sunday, April 21, 2013 12:31 AM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Thursday, April 17, 2014 1:41 AM
Points: 170, Visits: 1,400
Also worth taking snapshots of waiting tasks during these runs. It could bring up resource names that force everything else to wait for.

If there are many active spids but just one is under scrutiny - Paul Randal had a script to use XEvents and hone into a spid for a few seconds and see all waits incurred during those seconds. It's easy to capture 10s of thousands of events per second, but it is useful for a 'slice' of the query's activity.


Cheers,

JohnA

MCM: SQL2008
Post #1444780
Posted Sunday, April 21, 2013 3:36 AM


SSC-Forever

SSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-Forever

Group: General Forum Members
Last Login: Today @ 1:14 AM
Points: 42,412, Visits: 35,479
PHXHoward (4/20/2013)
Thanks much. What could I provide that would be useful? The XML of the execution plan?


Table definitions, index definitions and execution plan, as per http://www.sqlservercentral.com/articles/SQLServerCentral/66909/



Gail Shaw
Microsoft Certified Master: SQL Server 2008, MVP
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Post #1444784
Posted Monday, April 22, 2013 10:07 PM


SSCarpal Tunnel

SSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal Tunnel

Group: General Forum Members
Last Login: Yesterday @ 9:19 AM
Points: 4,317, Visits: 6,110
Table variable? User Defined Function? Up-to-date stats? CTE(s)? Improper datatypes? Incomplete join or valid join that has Cartesian? UDFs and Cartesians are my best guess. Will need to see tables, query statement and at least estimated plan though.

Best,

Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru at GMail
Post #1445263
« Prev Topic | Next Topic »

Add to briefcase

Permissions Expand / Collapse