July 10, 2015 at 4:13 am
Hi, I am fairly new to SQL Server Administration, so please excuse any ignorance.
I am experiencing application performance issues on one of my servers.
The company runs a business application from this server and the SQL Server instance is hosted on the same server.
During certain times of the day, the application gets SQL Server Timeout errors. This is occurring daily.
I have turned on the SQL Server Data Collector and have been running waitstats queries to try determine where the issue is lying. Both are telling me that the wait is ASYNC_NETWORK_IO.
I have run sp_who/who2 and I have seen that there are SPIDs that are in the status SUSPENDED. I either have to go to the user to close the application or kill the process myself, when I do that, the application runs fine again.
Some of the queries are running for minutes or even for hours. If I run the same queries myself through SSMS and they run in 1-2 seconds. I also run the reports through the application and I get very similar results, 1-2 seconds.
I know that ASYNC_NETWORK_IO is caused by either the application processing the data too slowly or by network issues.
If the application was processing the data to slowly, then I would experience the performance issue every time through the application, but this is not the case, it looks to be intermittent.
This only leaves a network issue...
My questions are:
1. What would cause a query to run fine most of the time and then almost indefinitely other times?
2. How do I go about diagnosing that it IS in fact a network issue, is there any monitoring that I can put on to prove my case?
I basically need to point a finger at someone, if its me, that fine, then I can fix it, if its the application, that can be fixed to. If its the network, then I need some solid proof that is in fact the network.
Any help or advice will be appreciated.
July 10, 2015 at 4:18 am
Is the query that is timing out doing so because of async network IO, or is that just a high wait on the server?
The sessions that are suspended, what are they waiting for?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
July 10, 2015 at 4:28 am
Thanks for the quick reply.
I believe the application is timing out because of a long wait on SQL Server for a response. I think these suspended queries are blocking other queries, causing the timeouts.
I believe the developers left the standard 30 seconds limit to wait for results.
How do I determine what the suspended queries are waiting on? I found the queries using sp_who2.
July 10, 2015 at 4:32 am
1) stop using sp_who/2 and start using sp_whoisactive. AMAZING freebie from Adam Machanic found on SQLBlog.com
2) it is quite likely that you have a parameter sniffing issue, or other plan difference issue possibly caused by something such as different set options. This can be seen by capturing your query plan from ssms execution and the production query plan for same query/inputs. Look at things such as estimated/actual rows and parameter compiled/execution values
3)
My questions are:
1. What would cause a query to run fine most of the time and then almost indefinitely other times?
See 2 above
2. How do I go about diagnosing that it IS in fact a network issue, is there any monitoring that I can put on to prove my case?
There are numerous good free network monitoring options. Microsoft's NetMon and wireshark are two that come to mind. But I would look elsewhere for tuning opportunities first honestly.
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
July 10, 2015 at 4:36 am
You need to confirm whether or not what you believe is true, before you start making changes. Otherwise you could waste a huge amount of time making ineffectual changes.
If you suspect blocking, check and confirm, is there blocking, is it where you expect, is it due to what you expect?
Bear in mind that a query which is SUSPENDED is waiting for something itself.
sp_who2 was written for SQL 2000, if not before, and is missing a lot of columns that have been added to the process-related tables since then. The wait type for the suspended queries should be in sp_who2, if not it's in the newer session and request DMVs
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
July 10, 2015 at 4:42 am
Thanks guys. I will give those suggestions a try and revert back later (the server is in the US, they are only waking up later:-))
July 10, 2015 at 4:45 am
For what it's worth, I don't think I've ever seen a SQL performance problem due to the network. I've seen general slow response from a cloud server, but that's general slow response, not a query that's sometimes fine and sometimes not.
Sometimes fine and sometimes not smells like parameter sniffing, blocking or statistics-related problems.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
July 11, 2015 at 5:38 am
Okay, so I have started using sp_WhoIsActive instead of sp_who/sp_who2, thanks for that, it really is an awesome tool.
I am now trying to determine what is causing the queries to become suspended (what are they waiting for).
I used a SQL Agent Job to periodically output the results of sp_WhoIsActive to a table, using the @destinationtable parameter. This has been working nicely.
I have now remotely queried the data from the table, via SSMS and its taking a very long time to return the results. The table has about 850 rows and after 20mins, it has only returned 488 rows to SSMS.
If I run sp_WhoIsActive right now on the server, it is returning 3 results. One for my query and 2 more for report queries from the business application.
The wait_info column is showing me this for the three queries:
(1163ms)ASYNC_NETWORK_IO - Application
(285ms)ASYNC_NETWORK_IO - Application
(55ms)ASYNC_NETWORK_IO - My Query
The two queries from the application have been running for over 12hrs...
I have extracted the queries from the sql_text column and run them in SSMS on the actual server, they return 500 rows in 1 second.
I have logged into the Business Application myself and run the same report they are running (the same query) and it gave me results in 3/4 seconds.
How do I now determine whether this is Blocking, Query Plan or statistics?
July 12, 2015 at 6:58 am
It can't be blocking, because sp_whoisactive would show that as the wait (and you would see a valud in the blocked column of the output).
There are SOO many things that could be at play here. I STRONGLY recommend you hire a performance tuning expert for a few hours to help you with this.
One thing you can do is a statement-level profiler trace, filtered to just the slow-running report queries. They could be firing stuff back and forth a lot for some weird reason.
You also need to capture and evaluate query plans and review a number of things on them.
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
July 12, 2015 at 7:44 am
Excuse my ignorance, but the sp_WhoIsActive procedure is telling me that the wait type is ASYNC_NETWORK_IO. Does that not mean that the query as already been run and the data is sitting in network buffer waiting to be sent to the application? How would a better execution plan make the queries complete any faster?
July 12, 2015 at 9:21 pm
Because it could be that your calls are being done one row at a time instead of set based, so check plan/execution calls to see if something is looping instead of saying "give me all rows in one set". Things can get converted to server-side cursors or other RBAR in a heartbeat by a bunch of different connectivity mechanisms.
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
July 13, 2015 at 7:36 am
The SQL query is a single select statement and the execution plan is not showing multiple queries, I don't see any evidence of looping or RBAR.
Viewing 12 posts - 1 through 11 (of 11 total)
You must be logged in to reply to this topic. Login to reply
This website stores cookies on your computer.
These cookies are used to improve your website experience and provide more personalized services to you, both on this website and through other media.
To find out more about the cookies we use, see our Privacy Policy