Connection to SQL Server dropping

  • Brandie Tarvin (11/28/2012)


    The error you posted doesn't look like a timed out connection, just a failed connection. That tells me that no connection was actually made. However, I'm not looking at everything you can see, therefore, all I could do was diagnose on what I could see.

    I put all that in my original post. The app DOES run and DOES communicate with the server. There is a set of text boxes on the active form that runs this process, and those boxes are updated as each piece of information is parsed, sent to the server, and the server's response is added to the display. It's a bit of visual candy, to keep the users from wondering whether the app is or is not doing anything, but it also serves to let me keep track of progress in the code. Every text line is read in, parsed, expanded to enumerate ranges (user puts in 10-20, I turn it into 10, 11, 12 13, 14,...,20) and the final expanded list of values is put into a text box. Each item in that line is then separately extracted, sent to the server for processing, and the results of EACH number are displayed in real time - removed from the expanded list, and an error message added into another box if there's a problem. Those numbers are peeled off at a rate of several dozen per second, even with the screen updating slowing things down. Each number calls a separate set of commands that open a recordset generated by a pass-through query, the result of the query extracted and the recordset closed again. If the server did not respond, these displays would not change. And that is exactly what happens, eventually. The numbers stop changing, because the app is not getting a response from the server, and a few seconds later, the afore-mentioned dialog pops up.

    When you say the app runs for a while, are you sure it's not timing out the connection? This would be one thing to look for in the logs.

    No, I'm not sure. I see no indication of a timeout; the error message just says "Connection failed", but doesn't go into details about why. There are the two state numbers (01000 and 08001) and one error number (17), but no text explanations and I haven't been able to figure out what those codes mean. And again, I increased the default ODBC connection timeout interval to tenfold its default value, and nothing changed.

    The text would be something along the lines of "Connection Timeout".

    I agree, it acts like a timeout, but that word does not appear anywhere in the communications from the server.

    Also, look for any warnings or errors around the time you had this problem (ignore the informational messages) to see if any of them directly relate to the database or to the application in question.

    I did look in the event log file on the disk: Program Files\Microsoft SQL Server\MSSQL.n\MSSQL\LOG\ERRORLOG and ERRORLOG.n

    There are no messages at all during the time that this was going on. No warnings, informational messages, nothing.

    Check for network issues as well. Your best bets are the Application & System logs in Event Viewer and the SQL Server logs (EDIT: Which I just read the post where you said you don't have them. Do any SQL logs exist for today? If not, you might have logging turned off or something else is going on with your server.).

    The log files on disk exist, and they contain other activity, but nothing during the time I was having these problems.

  • Brandie Tarvin (11/28/2012)


    Bhuvnesh (11/28/2012)


    WHat if we use SSIS to do this stuff more robust more tracable .. more comfortable to handle heavy volume data. i will welcome suggest on this suggest ๐Ÿ™‚

    If he's doing what I think he's doing (my workplace has a similar setup), SSIS doesn't work. This isn't a transfer of data from one system to another. It's an Access client application that uses SQL Server as its backend database.

    That is exactly what it is. MsAccess front end, SQL Server back end, ODBC connection.

  • Brandie Tarvin (11/28/2012)


    pdanes, have you checked your connection pooling properties both in the app and on the server to make sure you can reuse existing connections and that the pool isn't throttled so far back that it's causing the bump & dump?

    No, I haven't - I don't know how to do that. I'll look around for some info - do you happen to know of a tutorial on the subject? Interesting, though, this high-speed interrogation is the ONLY module causing problems. Otherwise, the app is often started and left connected for hours at a time, with sporadic use. That doesn't bother anything - the app and server respond instantly to everything, even after sitting idle for several hours.

  • I would go back to basics on this.

    If you can, copy your MSAccess app as a new file, zap everything but the code you are interested in - in that module, also zap everything to do with the form controls etc, try to stick to the bare minimum to query and retrieve the data, and display in query window or something.

    Re run code and see what happens .. F8 through it of course ๐Ÿ™‚ debug.print the generated SQL to double check it.

    Whilst doing the above, keep an eye on your logs (Windows mainly) but if you can, do set up a trace on your SQL Server to monitor the connection coming in and the generated SQL thrown at the DB.

    If from the above, you still can't identify the issue, then - if you can - get SQL Server express on another machine with default settings, copy your DB from original server to new server, repoint your slimmed down App from above to the new server, monitor the connections, generated SQL etc, and see what happens.

    I wonder if SQL Server Express has a hard coded max allowed connections? or some other restriction with regards to connections, data going back and forth?

    B

  • So you can't find any events that indicate the SQL Server services shut down, recycled, or that a connection got dropped *before* you got the error posted in your OP?

    And you did check the Application and System logs in Event Viewer? (Sorry to beat a dead horse, but I can't quite tell from your answer to that suggestion).

    I don't know what the connection pooling property in the Access app is, but in SQL Server, go to the instance name, right click, open properties. Navigate to Connections.

    What is the Maximum number of concurrent connections?

    What is the Remote query timeout?

    What (if any) default connection options do you have enabled?

    Do you require distributed transactions?

    Is the query governor enabled?

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • bleroy (11/29/2012)


    I would go back to basics on this.

    If you can, copy your MSAccess app as a new file, zap everything but the code you are interested in - in that module, also zap everything to do with the form controls etc, try to stick to the bare minimum to query and retrieve the data, and display in query window or something.

    Re run code and see what happens .. F8 through it of course ๐Ÿ™‚ debug.print the generated SQL to double check it.

    Whilst doing the above, keep an eye on your logs (Windows mainly) but if you can, do set up a trace on your SQL Server to monitor the connection coming in and the generated SQL thrown at the DB.

    If from the above, you still can't identify the issue, then - if you can - get SQL Server express on another machine with default settings, copy your DB from original server to new server, repoint your slimmed down App from above to the new server, monitor the connections, generated SQL etc, and see what happens.

    I wonder if SQL Server Express has a hard coded max allowed connections? or some other restriction with regards to connections, data going back and forth?

    B

    I can try making a stripped down version of the app and see it the problem remains. However, stepping through the code with F8 is unlikely to get me anywhere. As I wrote earlier, this only happens after around a minute of proper functioning, during which the code successfully processes many hundreds of record requests. Trying to duplicate that with F8 would take me weeks, and if it's actually some sort of saturation issue, it's extremely unlikely that I would cause it to happen by manually steppping the code, even if I had the patience to try it. I'm going to try putting in something like a couple of seconds SLEEP command every hundred records or so, see if that makes any difference.

    Also, I know where it's happening, since the loop that's executing when this bombs is one in which there is only a single call to SQL Server, to execute a stored procedure. I don't need to discover WHERE the problem is occuring during the execution of my code - I already know that, despite the fact that it doesn't trigger any trappable errors. What I don't know is WHAT is happening, or WHY. I can try to debug print the contents of every call, but since a reconnect allows the code to continue on and finish with no problems, I doubt if that would shed any light on the problem. Anyway, it's the same call, every time, only a single parameter value changes from call to call - the rest of the call is a string literal, right in the actual call. Zero possibility that it's getting inadvertantly munched somewhere in my code.

    The server-side trace is a good idea - this machine is essentially idle, so there is no problems with bogging it down. Duplicating the setup seems like an awful lot of work, and I don't have another server sitting around that I can use for that. I've got a migration planned to a paid version of SQL Server (already purchased, CD sitting on my desk), so if some more troubleshooting doesn't uncover the problem, I'll do the migration and see if that changes anything.

    As for connection numbers, there probably is some limit, but I don't see how I should be running up against that here. It's ONE app, running ONE task, essentially repeating the same call over and over. It should be using ONE connection for all these requests. If it is actually opening a new connection for each call, which I doubt, given the speed of execution, how would I persuade it to not do so?

  • Brandie Tarvin (11/29/2012)


    I don't know what the connection pooling property in the Access app is

    I don't believe Access manages this directly; it defers to the ODBC Manager & Driver for this. One would check via odbcad32, I believe.

  • I may be barking up the wrong tree here, but maybe you are running out of sockets? That's what TCP Error 10048 means, and I think that's probably what is being reported. The closedown of as connection (at the remote end) is not necessarily synchronous, so it may help to change TCP/IP parameters (on the server, not on the client) to cover a small delay. Alternatively it may help to change the number of sockets allowed. These are registry settings in HKLM\SYSTEM\CurrentControlSet\Services\TCPIP\Parameters

    the value name are TcpTimedWaitDelay and MaxUserPort.

    Of course if this is the problem the proper solution is to reuse the same TCP connection instead of closing it and opening a new one all the time, but that may be outside your control - certainly quite a lot of software insists on colsing connections and making new ones.

    Tom

  • Brandie Tarvin (11/29/2012)


    So you can't find any events that indicate the SQL Server services shut down, recycled, or that a connection got dropped *before* you got the error posted in your OP?

    No. Again, there is NOTHING in the log from the time period when this was going on. Normal, expected events several hours earlier (some failed login attempts) and about a half-hour later (scheduled nightly backup).

    And you did check the Application and System logs in Event Viewer? (Sorry to beat a dead horse, but I can't quite tell from your answer to that suggestion).

    No apologies necessary, Brandie, I'm grateful for any suggestion, even if it's one that doesn't work out. If I'm being less than clear in my responses to suggestions and questions, that's my fault, not yours.

    I just looked at the logs again. There is nothing in the application log from that time period (literally nothing, at all). The system log has several remarks about starting and stopping the Adobe Flash Player during that time. There is also this (translated from Czech as well as I can):

    "Working process with identifier 5944, which is servicing the application pool DefaultAppPool was terminated due to inactivity. The time limit for configuring this pool was set to 20 minutes. A new process will be started as necessary."

    The source given is WAS, and it's an informational message. But it only shows up three times in the log, and while I was playing with this, I experienced the problem many more than three times.

    I don't know what the connection pooling property in the Access app is, but in SQL Server, go to the instance name, right click, open properties. Navigate to Connections.

    Looking at it now. It's set to 0 (= unlimited). Also, the remote query timeout is 600 seconds, instead of the default 60, and the error occurs far earlier than ten minutes.

    What is the Maximum number of concurrent connections?

    0 (= unlimited)

    What is the Remote query timeout?

    600 seconds

    What (if any) default connection options do you have enabled?

    None - all boxes unchecked

    Do you require distributed transactions?

    No

    Is the query governor enabled?

    No

  • Banana-823045 (11/29/2012)


    Brandie Tarvin (11/29/2012)


    I don't know what the connection pooling property in the Access app is

    I don't believe Access manages this directly; it defers to the ODBC Manager & Driver for this. One would check via odbcad32, I believe.

    I don't know about that, but there is a box in the ODBC dialog for this. You check the box "Save long-running queries to file:" and that also activates a text box in which you specify the number of milliseconds that is considered long-running. It was 3000, I changed it to 30,000. Later I unchecked that box, which caused the value box to become disabled, but the tenfold increase seesm to have stuck, because the SQL Server instance query timeout value also shows 600 seconds, instead of the normal 60.

  • Actually, that's for a different purpose --- the intention with that one check is to log any query that takes too long. Think of it as a client-side performance tracing. It has nothing to do with managing connections and I don't think it'd help at all in your situation.

  • How about the data in the Express db? Does it fail trying to return the same record every time?

  • L' Eomot Inversรฉ (11/29/2012)


    I may be barking up the wrong tree here, but maybe you are running out of sockets? That's what TCP Error 10048 means, and I think that's probably what is being reported. The closedown of as connection (at the remote end) is not necessarily synchronous, so it may help to change TCP/IP parameters (on the server, not on the client) to cover a small delay. Alternatively it may help to change the number of sockets allowed. These are registry settings in HKLM\SYSTEM\CurrentControlSet\Services\TCPIP\Parameters

    the value name are TcpTimedWaitDelay and MaxUserPort.

    Neither of these keys exists anywhere on my server. The path you list is there, but those keys are not, there or anywhere else.

    Of course if this is the problem the proper solution is to reuse the same TCP connection instead of closing it and opening a new one all the time, but that may be outside your control - certainly quite a lot of software insists on colsing connections and making new ones.

    I can't discount the possibility, but I don't see error the #10048 that you mention anywhere, in hex or decimal. In any case, wouldn't such a refused attempt to connect show up in the error log? I can try explicitly opening a connection, instead of relying on Access to manage connections via the OpenRecordset calls, but I think that would mean changing all the attendant code to use ADO instead of DAO. That's certainly doable, but it's not exactly a trivial undertaking. But who knows? Changing to ADO might be enough all by itself to fix the problem. It's certainly worth a look, at least for this one module. Fortunately, Access is quite tolerant about using both ADO and DAO at the same time, as long as they're not mixed up, so I wouldn't have to change the entire app.

  • pdanes (11/29/2012)


    I just looked at the logs again. There is nothing in the application log from that time period (literally nothing, at all). The system log has several remarks about starting and stopping the Adobe Flash Player during that time. There is also this (translated from Czech as well as I can):

    "Working process with identifier 5944, which is servicing the application pool DefaultAppPool was terminated due to inactivity. The time limit for configuring this pool was set to 20 minutes. A new process will be started as necessary."

    The source given is WAS, and it's an informational message. But it only shows up three times in the log, and while I was playing with this, I experienced the problem many more than three times.

    So all your Server settings look the same as mine, which doesn't indicate that anything strange there is going on.

    But this message? This might be a clue. I'm going to do a little research and see if I can find out what this means.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • Banana-823045 (11/29/2012)


    Actually, that's for a different purpose --- the intention with that one check is to log any query that takes too long. Think of it as a client-side performance tracing. It has nothing to do with managing connections and I don't think it'd help at all in your situation.

    No, it didn't. That's why I turned it off again. I just tried turning it on again and changing the value back down to 3000 milliseconds. The 600 seconds in SQL Server did not change to match, so maybe my impression of parallel tenfold increase was erroneous. I'm going to try setting the SQL Server timeout value to 0 ( = unlimited) and see if that changes anything.

Viewing 15 posts - 16 through 30 (of 42 total)

You must be logged in to reply to this topic. Login to reply