SQL Server 2005 Cluster - [sqsrvres] CheckQueryProcessorAlive: sqlexecdirect failed



Author
Message
MikeM-150936
SSC Rookie (42 reputation)

Group: General Forum Members
Points: 42 Visits: 119
Everyone --

We are still working through the issue with Microsoft and EMC. The latest is that we have been able to single out a node in our cluster to be our guinea pig. As of this morning...

- We have found out that we had outdated HBA drivers for the HBAs connecting to our EMC SAN. There is also a mandatory (says EMC) patch from Microsoft required for the HBA. It should be in place now too.
- We have also put in some Server service/LanMan registry changes that Microsoft suggested.

One of the new things they found was related to this:
"The errors we’ve been getting indicate that the Server service is unable to keep up with the demand for network work items that are queued by the network layer of the input/output (IO) stream.

There are many causes, some of which only cause brief logging of error conditions (but may not cause failover), and these may be addressed by tuning the server service.

Disk subsystem not being able to keep up is the most common cause of the accumulation of work items in the server service."

He then went on to suggest this registry change:
To increase the capacity of the server service to handle incoming IO requests, please set the following registry settings (hexadecimal values) using regedit.exe:

HKLM\SYSTEM\CurrentControlSet\Services\lanmanserver\parameters
"MaxFreeConnections"=dword:00000064
"MinFreeConnections"=dword:00000020
"MaxRawWorkItems"=dword:00000200
"MaxWorkItems"=dword:00002000

--- We'll see where this takes us... I am also collecting additional information for him and will keep you posted.
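
For anyone who wants to sanity-check those values before typing them into regedit: they are hexadecimal DWORDs, so they look smaller than they really are. Below is a minimal Python sketch that converts them to decimal. The dictionary and function names are just illustrative; the values are copied from the registry settings quoted above.

```python
# Hexadecimal DWORD values copied from the lanmanserver\parameters
# settings quoted above (names here are illustrative only).
lanman_params = {
    "MaxFreeConnections": "00000064",
    "MinFreeConnections": "00000020",
    "MaxRawWorkItems": "00000200",
    "MaxWorkItems": "00002000",
}


def to_decimal(params):
    """Convert a dict of hex DWORD strings to a dict of decimal integers."""
    return {name: int(hex_value, 16) for name, hex_value in params.items()}


decimal_params = to_decimal(lanman_params)
for name, value in decimal_params.items():
    print(f"{name} = {value}")
```

That prints 100, 32, 512 and 8192 respectively, which is what you should see if you switch the regedit value editor from Hexadecimal to Decimal.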

-- Mike
thomas.halligan
Grasshopper (15 reputation)

Group: General Forum Members
Points: 15 Visits: 35
Thanks for your continued efforts on this. The registry fixes I posted appear to prevent the unneeded SQL Cluster resource failures and the attendant 19019 errors, but I suspect the underlying issue has not been resolved.

We have systems using HP and 3PAR based SANs that have been affected by this issue (we may have some systems attached to EMC SANs that are affected, but I am unaware of any at this time). With the application of the fix recommended by the Microsoft tech, the cluster resource failures have stopped, and therefore the clusters are no longer on the front burner, as it were.

I am continuing to see evidence of disk subsystem issues (VSS and VDS errors which are interrupting backups).

Systems have been checked against the respective SAN configuration matrices for HBA drivers/firmware, MPIO, etc.

We have been applying http://support.microsoft.com/kb/943295 against some of the affected systems and the jury is still out, but I have the feeling that we are not out of the woods regarding this issue.

I will be paying close attention to this thread.
MikeM-150936
SSC Rookie (42 reputation)

Group: General Forum Members
Points: 42 Visits: 119
Everyone --

Still not out of the woods yet. This is becoming a long (and painful) process. I've tried everything Microsoft has suggested and still nothing.

As for hotfixes, Thomas mentioned that he applied 943295. EMC told us to put in hotfix 943545, which I'm assuming is a newer fix than the one Thomas put in?

Today I thought that the communication link failures I'm getting *may* be fixed by CU10 (yes, they're up to CU10 for SP2), but I think I disproved that this afternoon.

I re-ran the process on another SQL2005-x64 machine that is SAN connected but not in a cluster... ran clean as a whistle.

I'll keep you posted.
-- Mike
MikeM-150936
SSC Rookie (42 reputation)

Group: General Forum Members
Points: 42 Visits: 119
Everyone --

Wanted to check back in... we think we got it. It'll take a few days to pull together everything we did and what Microsoft recommended, but the tweak that seems to have nailed it for us was this: we had made the (in hindsight not so smart) mistake of leaving SQL Server Priority Boost checked on the nodes in our cluster.

If any of you following this have done the same, uncheck it as soon as reasonably possible. Again, in hindsight, if you google around long enough you'll hit the articles that say you shouldn't have this checked in a cluster, but they only say it could cause networking problems, with no details or specific messages.

In our case, we got to the point where we stripped a node down to its bare bones. We uninstalled EVERYTHING on the node that wasn't critical and (with boost on) ran a test in which I ran a procedure that reliably causes the 19019 events, with Profiler running. SQL Profiler showed that the cluster service would periodically get dropped as a connection from SQL Server. THIS DROP is what was generating the 19019 errors! On a hunch, one of our DBAs thought that *maybe* this priority boost thing was choking other processes on the node (the cluster service being one of them), forcing them to give way to the higher-priority SQL Server.

Sure enough, we switched off this setting... not a single doggone 19019 since.

Like I said, I will try to get back to you folks within a few days, maybe a week, to compile everything we tried and all of Microsoft's recommendations based on our particular environment. In short, in no particular order, the things Microsoft cited in our environment:
1. Spikes in disk activity on our SAN (this got better after applying the latest drivers/hotfixes).
2. They claimed that the NIC cards in our nodes were teamed, which is a cluster no-no (our cards were not teamed, period).
3. They tried to reference an obscure match with Quest's SQL Litespeed causing the problem when using native command substitution. Not buying this one: we've had Litespeed for years and have always used the xp_ procs, not command substitution.
4. They suggested that we look at / play with our MAXDOP options, currently set to 0 on each of our nodes (this was suggested after we mentioned to them that we had stumbled upon the priority boost thing).
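
If you would rather script the priority boost change than untick the box in the server properties dialog, it comes down to sp_configure. Here is a rough Python sketch that only builds the T-SQL text (it does not connect to any server). `priority boost`, `max degree of parallelism` and `show advanced options` are the standard sp_configure option names; the function name `build_sp_configure_batch` is made up for illustration, and the MAXDOP value of 4 is an arbitrary example, not a recommendation from Microsoft or from anyone in this thread.

```python
def build_sp_configure_batch(option, value):
    """Return a T-SQL batch (as text) that sets one sp_configure option."""
    # Double any single quotes so the option name is a valid T-SQL string literal.
    safe_option = option.replace("'", "''")
    statements = [
        # Both options used below are "advanced", so they must be made visible first.
        "EXEC sp_configure 'show advanced options', 1;",
        "RECONFIGURE;",
        f"EXEC sp_configure '{safe_option}', {int(value)};",
        "RECONFIGURE;",
    ]
    return "\n".join(statements)


# Turn priority boost off (only takes effect after the SQL Server service restarts).
print(build_sp_configure_batch("priority boost", 0))

# Hypothetical example value only: cap parallelism at 4 instead of the default 0.
print(build_sp_configure_batch("max degree of parallelism", 4))
```

The generated batch can be pasted into a query window against each instance. Keep in mind that the priority boost setting needs a SQL Server service restart before it actually changes anything, so plan that for a quiet period.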


Take it easy -- Mike
CoetzeeW
SSC-Enthusiastic (153 reputation)

Group: General Forum Members
Points: 153 Visits: 161
Hi

I have a newly built SQL Server cluster and am getting these errors with no application volume at all.



fpereiras
SSC Rookie (26 reputation)

Group: General Forum Members
Points: 26 Visits: 160
Hi, my name is Fabio Pereira and I'm Brazilian. Did the modification solve this problem for you?

Thanks.
Nicholas Cain
Hall of Fame (3.2K reputation)

Group: General Forum Members
Points: 3216 Visits: 6200
Prakash.Bhojegowda (5/7/2008)
I called Microsoft and was on the phone with them for 6 hours yesterday. They were not able to give me an explanation. They turned around and said that this is the way SQL 2005 is designed to work.


I still do not agree with Microsoft because one of my friends, who works for another IT firm, does have SQL 2005 and he says that SQL 2005 should allocate all the available memory. If a maximum of 14 GB of memory is configured on SQL Server, SQL should utilize every bit of it.

If anyone has corrected the issue, please let me know; I am eager to know the solution.


Your friend is correct within certain contexts.

If you are using a 32-bit version of SQL 2005, it will not use all the memory unless PAE and AWE are enabled.
On a 64-bit system it will use all of the memory; however, it will not take all of that memory immediately. It starts off small and then ramps up as memory is needed, up to the maximum that it is allocated.



Shameless self promotion - read my blog http://sirsql.net
465789psw
Old Hand (344 reputation)

Group: General Forum Members
Points: 344 Visits: 472
Still looking for some answers on this. Did anyone figure it out?



Sunil-239779
Grasshopper (18 reputation)

Group: General Forum Members
Points: 18 Visits: 152
Does anyone have a solution for this issue?
Please reply.
Thanks in advance.
Seth Lynch
Ten Centuries (1.2K reputation)

Group: General Forum Members
Points: 1213 Visits: 603
Hi
We had almost the same problem. We looked at CPU affinity, network cards, etc.

In the end I found it was because Priority Boost was enabled on the installation (it was already set, and the server was already failing, before I arrived).

Once I set this to 0 the mysterious reboots ended, along with the event viewer errors which used to happen 2 or 3 times a day.

Priority Boost changes require a service restart before they take effect.

Hope this helps some of you.

Seth

(actually just noticed someone else has said the same thing a few posts earlier - I'll leave this for those like me who get forum thread blindness)