Communications to SQL server "freeze", then resume.
Posted Wednesday, September 12, 2012 11:23 AM
SSC-Enthusiastic

Group: General Forum Members
Last Login: Monday, March 31, 2014 3:05 PM
Points: 175, Visits: 449
I am looking to see if anyone has ideas on where to start looking for problems, or can point out something I missed.

Problem: A few times per day, our internal traffic to our SQL server will "freeze", and our web interface and VB 6 Windows application will "freeze" with it. No external web surfing or IP phone communications are affected. Each "freeze" lasts about 2 to 5 minutes. Every workstation is affected at the same time, for the same duration. The problem occurs AROUND 15-minute intervals, but typically does not start directly on the mark; instead it starts a minute or two away. Examples: 9:00:50, 9:17:10, 9:47:05, 10:00:30. These times have been consistent for the past 3 days. When the "freeze" is over, everything just resumes as though nothing happened.
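To make the "around the 15-minute marks" observation concrete, here is a small sketch that computes how far each quoted start time falls from the nearest quarter-hour mark. The function name is made up for illustration, and only the four example times from this post are used.

```python
from datetime import datetime

def offset_from_quarter_hour(ts: datetime) -> int:
    """Signed seconds between ts and the nearest :00/:15/:30/:45 mark."""
    seconds_into_hour = ts.minute * 60 + ts.second
    remainder = seconds_into_hour % 900  # 900 s = 15 min
    # Past the halfway point the next mark is closer, so report a negative offset.
    return remainder if remainder <= 450 else remainder - 900

# Freeze start times quoted in the post (the date part is irrelevant here).
starts = ["9:00:50", "9:17:10", "9:47:05", "10:00:30"]
offsets = {s: offset_from_quarter_hour(datetime.strptime(s, "%H:%M:%S")) for s in starts}

for start, off in offsets.items():
    print(f"{start:>8} -> {off:+d} s from the nearest quarter hour")
```

For these four samples the offsets are all positive and range from 30 to 130 seconds, which would fit something that is triggered on the quarter hour and takes a short while to bite. Four data points are too few to say more than that.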

If anyone has suggestions for things I should look at, they would be much appreciated. Below are the things I have looked into so far, none of which has shown what is causing the problem.

I have Confio Ignite 8 and have been working with their support. We don't really see anything, other than that during these "freezing" times the gathering of hardware data, and sometimes of SQL statement information, stops. Nothing is recorded and Ignite shows chunks of missing data. When it can pick up some data, nothing shows strain on the server.

I have perfmon counters set for CPU, Memory, and Disk. During the "freezing", I see no anomalies or large strain. I see nothing to indicate the server is physically struggling.

I have captured sp_who2, sp_who2 active, and custom "running SPIDS" query results during these "freezes" and see nothing hitting the server hard. I see a normal number of open connections, nothing being dropped, a normal number of running connections, and normal activity. There is no blocking. There is a very small number of transactions in queue.

I have checked the SQL server error log and seen nothing of value for this case.

I even went to the pain of running a profiler against production during the times I knew there would be a "freeze" - to pull ANY errors from SQL server. Nothing but some log completion events, no errors.

I have switched some SQL Agent schedules to execute at different times.

I have worked through a few things with RedGate support for log shipping, since our log shipping runs at 15-minute intervals (but exactly on the 15-minute marks).

Only once did a user report an error. The freeze times that user reported were:

9/11/2012:
9:00 am 2 mins
9:32 am 2 mins
9:46 am 5 mins

This error came up once it unfroze:
Error at: 9/11/2012 9:49:19 AM

Error: [DBNETLIB][ConnectionWrite (send()).]General network error. Check your network documentation.


I have had the Network Admin check the event logs on the SQL server, and the switch logs. Nothing came up during the problematic times.

We have an audit mechanism that shows us how long individual queries take. During these freezing periods, the logged execution duration SOMETIMES spikes. As in, it seems to keep the connection, just not do anything with it, then resume after 2-5 minutes and log that the execution took longer. The mechanism is built into the app: the application takes a timestamp, runs the query, takes another timestamp, then logs what was executed and how long it took. There are spikes, but this doesn't really give detail as to WHY. The web application, where no errors are seen, has a timeout for some pages. Our main app has a built-in timeout of 30 minutes. The web has between 2 and 5, depending on what is being used. Whatever is being used, no one is reporting timeouts or thrown/shown errors.
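For readers unfamiliar with this kind of client-side audit, a minimal sketch of the timestamp / run / timestamp / log pattern follows. It is not the application's actual code (that is VB 6 and is not shown in this thread); the function name `timed_call`, the log structure, and the stand-in query are all hypothetical.

```python
import time
from datetime import datetime

def timed_call(label, run_query, log, clock=time.monotonic):
    """Run run_query(), then append (wall-clock start, label, elapsed seconds) to log."""
    started_at = datetime.now()
    t0 = clock()
    try:
        return run_query()
    finally:
        # Logged even if the query raises, so a failed call still leaves a duration.
        log.append((started_at, label, clock() - t0))

audit_log = []
# Stand-in for a real database call; any zero-argument callable works here.
result = timed_call("example query", lambda: sum(range(1000)), audit_log)
```

Because the elapsed time is measured on the client, it includes network transit and any time the request spent waiting before the server touched it. That is why spikes in such a log show that a call was slow but not where the time went.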


  Post Attachments 
CPU.jpg (7 views, 393.17 KB)
Post #1358141
Posted Wednesday, September 12, 2012 2:15 PM
SSCrazy

Group: General Forum Members
Last Login: Yesterday @ 12:26 PM
Points: 2,742, Visits: 2,953
Is the database set to auto close?
Post #1358203
Posted Wednesday, September 12, 2012 3:23 PM
SSC-Enthusiastic

Auto Close is false on all system and user databases.
Post #1358233
Posted Thursday, September 13, 2012 8:53 AM
SSC-Enthusiastic

At 9 am, I could run sp_who2 and get query results. CPU was at about 40 percent, which is nothing unusual even when no blocking is occurring (file attached).

At 9:30 am I could NOT run sp_who2 (the query froze until the server was done doing whatever it was doing, and finally returned results when it was done). The CPU level dropped to about 5% on all cores. When the connection came back there was a CPU spike, but the spike came with actual activity.

Sometimes the CPU spike is there and that causes a block. I just had a "freeze" at 9:50, a time at which I have never had one before. Watching Task Manager and all my other alert mechanisms, nothing was visibly different from 9:49 or 9:48.


  Post Attachments 
2012-09-13-CPU.jpg (5 views, 200.66 KB)
2012-09-13-CPU_930am.jpg (6 views, 317.86 KB)
Post #1358628
Posted Thursday, September 13, 2012 11:17 AM
SSC-Enthusiastic

Not every time, but a couple of times this morning, I captured a small group (10-15) of ATTENTION events in Profiler AFTER these freezes were over. Otherwise I very seldom see the Attention event, and when I do it comes from IIS, so I figure those are timeouts from someone running a report. But when a freeze happens and ends, a batch of these events is sometimes dumped in all at once.

Saw this:

"The Attention event class indicates that an attention event, such as cancel, client-interrupt requests, or broken client connections, has occurred. Cancel operations can also be seen as part of implementing data access driver time-outs"

Also, only once during all of these freezes did I get an error message from a user:
"Error: [DBNETLIB][ConnectionWrite (send()).]General network error. Check your network documentation.
Called: frmMain:SearchTran"

We sometimes get this kind of error when our switch reboots. However, it was seen only once across all the countless freezes.
Post #1358725
Posted Thursday, September 13, 2012 11:21 AM
SSC-Enthusiastic

Sometimes, but not always, when the freeze occurs, these two messages appear multiple times in the SQL Agent log:

[382] Logon to server 'MANNASQL2' failed (ConnAttemptCachableOp)
[165] ODBC Error: 0, Unable to complete login process due to delay in opening server connection [SQLSTATE 08001]

I have found few resources on this, except for suggestions that Domain Controller pass-through authentication may be suspect.

No other messages are seen in any event logs or server logs.


  Post Attachments 
sqlAgentLog.jpg (2 views, 234.31 KB)
Post #1358731
Posted Thursday, September 13, 2012 11:24 AM


SSC-Insane

Group: General Forum Members
Last Login: Today @ 6:59 AM
Points: 22,492, Visits: 30,186
I have to ask, is SQL Server the only component being monitored at this time? Just a gut feeling, but I am having a hard time believing this is a SQL Server problem, at least on its own. Something tells me there is more going on with this.



Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
Post #1358733
Posted Thursday, September 13, 2012 11:31 AM
SSC-Enthusiastic

The box is a dedicated machine, and other processes have been checked with no variance seen on the box when the problem occurs. The box has 36 GB of memory in total, of which I allocated 20 GB to the SQL Server service. SSAS is also on the box, but it is very minimally used and our cube is re-processed once nightly. No one touches it during the day other than to query the cube.

Since day 2, domain controller pass-through authentication, internal firewalls, and our switches have been suspects.

Each of these has been checked, as have the system and application event logs, with nothing conclusive found.

I have also started to monitor our DR server, which has log shipping going to it. I have been looking at perfmon collections from our file servers and other servers, and nothing pops out to say those servers are having any issues at the same time (though I have not checked them as thoroughly as the SQL server, since I really need my Net Admin to be doing that).

The machine itself does not have any other processes taking any additional resources I can see (though admit I could very well be missing something).

All of this has me leaning toward pass-through authentication. But I have nothing at this time to prove that it's not the SQL server, since the data captured on the SQL box is so inconsistent when a freeze occurs.
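One way to gather evidence either way might be to time a bare TCP connect to the SQL Server port from a workstation, since opening the socket involves no login or authentication. The sketch below is only an illustration: the function name is made up, and the host and port to probe are assumptions the reader must supply (a default instance commonly listens on TCP 1433, but that has not been confirmed for this server).

```python
import socket
import time

def tcp_connect_seconds(host, port, timeout=5.0):
    """Return seconds taken to open a TCP connection, or None on failure or timeout."""
    t0 = time.monotonic()
    try:
        # Only the TCP handshake is timed; no SQL login is attempted.
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - t0
    except OSError:
        return None
```

Called every few seconds with a timestamp logged next to each result, this would show whether the raw connection also stalls during a freeze. If the connect stays fast while logins and queries hang, that would point away from the switches and toward authentication or the server itself; if the connect also stalls or fails, the network path becomes the stronger suspect.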
Post #1358739
Posted Thursday, September 13, 2012 11:33 AM


SSC-Insane

Is this server the only server running SQL Server?



Lynn Pettis

Post #1358741
Posted Thursday, September 13, 2012 11:44 AM
SSC-Enthusiastic

We have a DR server for log shipping that sits dormant aside from restores. It's another dedicated box that used to be our production server 4 years ago. When my company upgraded the production server, the older one was sent to a DR facility, where it receives and restores logs. We have a "stage"/testing box with a CPU license. We have a "development"/coding box with 5 or so dev CALs and connections, sitting on a VM.

Post #1358750