
SQL Server 2005 Cluster - [sqsrvres] CheckQueryProcessorAlive: sqlexecdirect failed
Posted Monday, September 17, 2007 4:15 AM
Grasshopper


Hi all,

I'm hoping that you can help me with a problem that seems to be quite common on the web but has very few definitive fixes.

Server/SQL Background:

2 nodes with Windows Server 2003 Enterprise Edition SP1 and 16 GB RAM. SQL Server memory is fixed at 14 GB, with 2 GB reserved for the OS. Default instance.

SQL Server 2005 Standard Edition (Patch Level 9.0.2153).

Active/Passive Cluster configuration.

Application Log Errors:

[sqsrvres] printODBCError: sqlstate = 08S01; native error = 0; message = [Microsoft][SQL Native Client]Communication link failure

[sqsrvres] CheckQueryProcessorAlive: sqlexecdirect failed

[sqsrvres] OnlineThread: QP is not online.

[sqsrvres] printODBCError: sqlstate = 08S01; native error = 2746; message = [Microsoft][SQL Native Client]Communication link failure

[sqsrvres] printODBCError: sqlstate = 08S01; native error = 2746; message = [Microsoft][SQL Native Client]TCP Provider: An existing connection was forcibly closed by the remote host.

[sqsrvres] printODBCError: sqlstate = 08S01; native error = 40; message = [Microsoft][SQL Native Client]Communication link failure

[sqsrvres] printODBCError: sqlstate = 08S01; native error = 40; message = [Microsoft][SQL Native Client]TCP Provider: The specified network name is no longer available.

Cluster Log Errors:

00000f00.00001e08::2007/09/15-11:45:08.874 ERR  SQL Server <SQL Server>: [sqsrvres] CheckQueryProcessorAlive: sqlexecdirect failed
00000f00.00001e08::2007/09/15-11:45:08.874 ERR  SQL Server <SQL Server>: [sqsrvres] printODBCError: sqlstate = 08S01; native error = 2746; message = [Microsoft][SQL Native Client]TCP Provider: An existing connection was forcibly closed by the remote host.

00000f00.00001e08::2007/09/15-11:45:08.874 ERR  SQL Server <SQL Server>: [sqsrvres] printODBCError: sqlstate = 08S01; native error = 2746; message = [Microsoft][SQL Native Client]Communication link failure
00000f00.00001e08::2007/09/15-11:45:08.874 ERR  SQL Server <SQL Server>: [sqsrvres] OnlineThread: QP is not online.
00000f00.00001e08::2007/09/15-11:45:08.874 ERR  SQL Server <SQL Server>: [sqsrvres] CheckQueryProcessorAlive: sqlexecdirect failed
00000f00.00001e08::2007/09/15-11:45:08.874 ERR  SQL Server <SQL Server>: [sqsrvres] printODBCError: sqlstate = 08S01; native error = 0; message = [Microsoft][SQL Native Client]Communication link failure
00000f00.00001e08::2007/09/15-11:45:08.874 ERR  SQL Server <SQL Server>: [sqsrvres] CheckQueryProcessorAlive: sqlexecdirect failed
00000f00.00001e08::2007/09/15-11:45:08.874 ERR  SQL Server <SQL Server>: [sqsrvres] printODBCError: sqlstate = 08S01; native error = 0; message = [Microsoft][SQL Native Client]Communication link failure
00000f00.00001e08::2007/09/15-11:45:08.874 ERR  SQL Server <SQL Server>: [sqsrvres] CheckQueryProcessorAlive: sqlexecdirect failed
00000f00.00001e08::2007/09/15-11:45:08.874 ERR  SQL Server <SQL Server>: [sqsrvres] printODBCError: sqlstate = 08S01; native error = 0; message = [Microsoft][SQL Native Client]Communication link failure
00000f00.00001e08::2007/09/15-11:45:08.874 ERR  SQL Server <SQL Server>: [sqsrvres] CheckQueryProcessorAlive: sqlexecdirect failed
00000f00.00001e08::2007/09/15-11:45:08.874 ERR  SQL Server <SQL Server>: [sqsrvres] printODBCError: sqlstate = 08S01; native error = 0; message = [Microsoft][SQL Native Client]Communication link failure
00000f00.00001e08::2007/09/15-11:45:08.874 ERR  SQL Server <SQL Server>: [sqsrvres] CheckQueryProcessorAlive: sqlexecdirect failed
00000f00.00001e08::2007/09/15-11:45:08.874 ERR  SQL Server <SQL Server>: [sqsrvres] printODBCError: sqlstate = 08S01; native error = 0; message = [Microsoft][SQL Native Client]Communication link failure
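To put a number on how often the health check fails, the cluster log can be tallied with a small script. This is a minimal sketch in Python, assuming every line follows the layout seen in the excerpt above (process id, thread id, timestamp, level, message). The field names and both function names are made up for illustration.

```python
import re
from collections import Counter

# Assumed line layout, inferred only from the excerpt in this thread:
#   <process id>.<thread id>::<YYYY/MM/DD-HH:MM:SS.mmm> <LEVEL>  <message>
LINE_RE = re.compile(
    r"^(?P<pid>[0-9a-fA-F]{8})\.(?P<tid>[0-9a-fA-F]{8})::"
    r"(?P<ts>\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+(?P<message>.*)$"
)


def parse_line(line):
    """Return the fields of one cluster log line as a dict, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


def count_incidents(lines, needle="CheckQueryProcessorAlive: sqlexecdirect failed"):
    """Count ERR lines containing `needle`, grouped by minute (YYYY/MM/DD-HH:MM)."""
    per_minute = Counter()
    for line in lines:
        record = parse_line(line)
        if record and record["level"] == "ERR" and needle in record["message"]:
            per_minute[record["ts"][:16]] += 1
    return per_minute
```

Feeding it the lines of the cluster log gives a per-minute count, which makes it easier to line the failures up against scheduled jobs, backups, or other activity.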

Have tried a few things that have been suggested. The ones that spring to mind are:

- Ensured that the cluster service account has sufficient SQL access to run 'SELECT @@Servername'

- Fixed memory size

- Disabled SynAttackProtect in the registry

- Upgraded the HP iLO network card drivers

- Have followed the MS guidelines for cluster configuration

Any further suggestion would be great and much appreciated.

The error occurs completely randomly, around 4-5 times in any 24-hour period. Also, failover does not occur: the cluster stays up and running, but it disappears off the network (from a SQL point of view) for a couple of seconds.

Cheers,

Dave.




Post #399630
Posted Tuesday, April 22, 2008 2:07 PM
Forum Newbie

Hi Grasshopper
Did you ever get an answer to this? I'm facing the exact same problem.
Thanks in advance!
Post #488895
Posted Wednesday, April 23, 2008 4:34 AM
Grasshopper

Hi,

Nope, never did! I've done just about everything that I can do, except for opening a call with MS or rebuilding the cluster. I will be opening a call with MS this week though, and will report anything back here.

Cheers,

Dave.



Post #489168
Posted Wednesday, April 23, 2008 10:49 AM


SSC Rookie

I am experiencing some issues with SQL Server 2005 installation.
Task Manager on the active node of the server shows Total Physical Memory: 16776172 and Available Physical Memory: 15915488. Does this mean that SQL is not utilizing any memory? Also, the SQL Server Agent log has events like "8 processor(s) and 4096 MB RAM detected".

Any help is very much appreciated.
Post #489429
Posted Thursday, May 1, 2008 11:58 AM
SSC Rookie

I have exactly the same issue on my cluster. Any positive comments now?
Post #493817
Posted Thursday, May 1, 2008 2:48 PM
Forum Newbie

Disable TCP Chimney, TCPA and RSS

HKLM-->System-->CurrentControlSet-->Services-->TCPIP-->Parameters

Change the following entries to 0 and reboot

EnableTCPChimney
EnableTCPA
EnableRSS

Alternatively, set the KeepAlive setting via SQL Server Configuration Manager to a higher value than the application-side connection pool timeout setting.
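The three registry changes above can also be scripted. This is a minimal sketch in Python that only builds the `reg add` command lines and prints them. It executes nothing. The value type `REG_DWORD` is an assumption, since the post only says to change the entries to 0, and `build_reg_commands` is a made-up helper name. Back up the key and review the output before running anything on a cluster node.

```python
# Key path as given in the post: HKLM, System, CurrentControlSet, Services, TCPIP, Parameters.
KEY = r"HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters"
VALUES = ("EnableTCPChimney", "EnableTCPA", "EnableRSS")


def build_reg_commands(key=KEY, names=VALUES, data=0):
    """Return one 'reg add' command string per value name; nothing is executed."""
    # REG_DWORD is an assumption: the post does not state the value type.
    return [f'reg add "{key}" /v {name} /t REG_DWORD /d {data} /f' for name in names]


if __name__ == "__main__":
    for command in build_reg_commands():
        print(command)
```

As the post says, a reboot is still needed afterwards for the change to take effect.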



Post #493915
Posted Wednesday, May 7, 2008 2:47 PM
SSC-Enthusiastic

So, guys, do you have an answer to this issue?
Post #496704
Posted Wednesday, May 7, 2008 3:22 PM
Forum Newbie

So after talking with MS and EMC for over two weeks, we've gotten down to what our problem was/is.
When we upgraded our machine from 6 GB to 36 GB of memory, it let us keep all our tables in memory. The side effect was that when our maintenance job runs, the in-memory tables now flood the disks with write requests and overload them with the number of IOPS. Once the disk cache on the SAN (4 GB write cache) fills up, I/O comes to a screeching halt on the machine, as it has to start retrying and deal with the large number of requests waiting in the queue on the server. This makes our node less than fully responsive, as it pegs the first CPU (which, ironically, in a default setup is the only CPU that handles network requests). So for us the "fix" was a few things:
1. We turned off the write cache on the LUN that the DB was attached to, which in turn made the disks slower, so that the host did not throw all the thousands of write requests onto the LUN and it throttled better.
2. We also applied a registry change, as requested by MS, that allows the network load to be assigned to any open processor rather than relying on one processor (I'll paste the article number and some more info below). This single-processor behavior by itself may be one of the root causes of your cluster "Network" errors.
3. We also tuned the query to reduce the I/O load.
4. We changed the LUN from RAID 5 to RAID 10 (with a few extra disks for I/O performance) and turned the write cache on the SAN back on.

I'd also check whether you're getting any I/O requests taking longer than 15 seconds, as that points to a disk issue too. As a test, lowering our SQL Server memory setting back down to 6 GB also eliminated the issue for us, because the disks then had to serve reads, which slowed down the flood of writes.
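One way to run that check is to scan the SQL Server error log for the long-I/O warning. This is a minimal sketch in Python. The message wording in the pattern is what SQL Server is generally reported to log and should be treated as an assumption to verify against your own error log. The sample lines, file path, and database name are invented for illustration and are not from this thread.

```python
import re

# Assumed wording of the warning; check it against your own ERRORLOG before relying on it.
LONG_IO_RE = re.compile(
    r"encountered (?P<count>\d+) occurrence\(s\) of I/O requests taking longer than "
    r"(?P<seconds>\d+) seconds to complete on file \[(?P<file>[^\]]+)\]"
)


def long_io_by_file(lines):
    """Sum the reported occurrence counts per file named in the warning."""
    totals = {}
    for line in lines:
        match = LONG_IO_RE.search(line)
        if match:
            path = match.group("file")
            totals[path] = totals.get(path, 0) + int(match.group("count"))
    return totals


# Made-up sample lines, for illustration only.
sample = [
    r"2008-05-01 02:10:11.12 spid2s SQL Server has encountered 3 occurrence(s) of I/O requests taking longer than 15 seconds to complete on file [E:\Data\example.mdf] in database [example] (5).",
    r"2008-05-01 02:15:40.07 spid2s SQL Server has encountered 2 occurrence(s) of I/O requests taking longer than 15 seconds to complete on file [E:\Data\example.mdf] in database [example] (5).",
]
```

If the totals cluster around the times the network errors appear, that supports the disk explanation rather than a pure network fault.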

So while we really improved performance for our users by adding the additional RAM, we didn't take into account the additional load that could now be placed on the disks.

Fun stuff.


-Greg

MS info:
The processor load is not distributed across multiple processors on a computer that is running Windows Server 2003, Windows 2000 Server, or Windows NT 4.0

System error 64 has occurred. The specified network name is no longer available.


http://support.microsoft.com/kb/892100

Procs  HEX   BIN
2      0x3   0b11
3      0x7   0b111
4      0xF   0b1111
8      0xFF  0b11111111
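The table above reads as a bitmask with one bit set per processor. That reading is an assumption drawn from the pattern of the values, not something stated in the post. Under it, the value for any processor count can be derived as follows. The name `affinity_mask` is made up for illustration.

```python
def affinity_mask(processors):
    """Return an integer with the lowest `processors` bits set (one bit per processor)."""
    if processors < 1:
        raise ValueError("processors must be at least 1")
    return (1 << processors) - 1


if __name__ == "__main__":
    # Rebuild the rows of the table for the same processor counts.
    for count in (2, 3, 4, 8):
        mask = affinity_mask(count)
        print(count, f"0x{mask:X}", f"0b{mask:b}")
```

Check the KB article itself for the exact registry value this mask belongs to before applying it.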


Also, some additional notes from our MS case:

Apply MANDATORY Microsoft Hotfix 946448, required for all STORport driver installations (for Windows 2003 SP1/2) - http://support.microsoft.com/kb/946448
Post #496736
Posted Wednesday, May 7, 2008 3:26 PM


SSC Rookie

I called Microsoft and was on the phone with them for 6 hours yesterday. They were not able to give me an explanation. They turned around and said that this is the way SQL 2005 is designed to work.


I still do not agree with Microsoft, because a friend of mine who works for another IT firm also runs SQL 2005, and he says that SQL 2005 should allocate all the available memory. If a maximum of 14 GB of memory is configured on SQL Server, SQL should utilize every bit of it.

If anyone has corrected the issue, please let me know; I am eager to know the solution.
Post #496740
Posted Wednesday, May 7, 2008 3:43 PM
Forum Newbie

SQL 2005 will not immediately allocate all the memory. It will just grab what is configured as its min and grow from there. But once it has grown, it won't release it unless called for by the OS. Are you also using the "lock pages in memory" option? Definitely leave at least 2 GB for the OS outside of SQL, though.
http://blogs.technet.com/askperf/archive/2008/03/25/lock-pages-in-memory-do-you-really-need-it.aspx
Post #496750