SQL 2000 cluster with AWE setting

Question

Post reply

SQL 2000 cluster with AWE setting

Denby

Hall of Fame

Points: 3120
More actions
January 2, 2009 at 3:05 pm

#197647

Hi guys,
We have been experiencing a major production issue, Memory usage on the active node is at 2.6 GB even though there is 8GB installed on each node.
SQL and OS version:
SQL: 2000 ent 32 bit SP4
OS: Windows 2003 enterprise 32bit SP1
Other settings:
1. Awe enabled
2. Max SQL server mem at 6.4 GB
3. Lock Pages granted to SQL service group
whenever this problem occur to fix it i had to failover to node 2........which will show mem usage up to 7.4GB. However after five days it went back down to 2.6 GB bringing SQL to a halt (we then failed back to node 1 and everything seem fine at least for now).
Has anyone experienced this before?
Any solutions to this problem is greatly appreciated

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply

george sibbald SSC Guru Points: 104210 More actions · Answer 1

possibly this bug?

http://support.microsoft.com/kb/899761

If I remember rightly the original release of SP4 contained this flaw, so it was re-released, so where did you get your SP4 executable from? Presumably you only applied it recently, so is it a copy of SP4 you have had for a while and are you sure its the latest version of it?

You say mem usage starts at 7.4GB even though SQL max memory is set to 6.4GB, so presumably other processes are using some of this. Are you running perfmon to track memory usage of processes on the server to see whats happening?

Are the PAE and 3GB switches set in the boot.ini?

---------------------------------------------------------------------

Denby Hall of Fame Points: 3120 More actions · Answer 2

Thanks for ur reply..

i recently inherited this environment 5 months however they had this same prob then.

SP version>> SP4 8.00.2040

The PEA swith is set in the boot.ini file on both servers.

The issue is we are seeing total memory usage at 2.6 GB in task manager, even thought max mem setting is at 6.4 GB.

As for tracking mem usage, i have been doing so by looking at DBCC memorystatus results but i don't see anything that would have indicated memory pressure see below.

(Buffer dist) (Buffers)

Stolen 3616

Free 1369

Procedures14952

Inram 0

Dirty 13809

Kept 0

I/O 0

Latched 184

Other 817053

(Buffer Counts) (VALUE)

Commited 850983

Target 850983

Hashed 831046

InternalReservation731

ExternalReservation356

Min Free 656

Visible 199456

(Dynamic Memory Man) (Buffers)

Stolen 18113

OS Reserved 2488

OS Committed 2466

OS In Use 2452

General 1897

QueryPlan 15037

Optimizer 21

Utilities 94

Connection 2396

ALSO SQL Server: (Total server memory) in perfmon is now showing 6.4 GB

george sibbald SSC Guru Points: 104210 More actions · Answer 3

when using AWE the value shown for sql server memory in task manager is incorrect, use the value from perfmon.

I would also set the /3GB switch in the boot.ini to take maximum advantage of the memory below the 4GB line.

what do you mean when you say sql server comes to a halt, are you sure its memory pressure? Any errors in the log?

track perfmon counters such as buffer cache hit ratio, page faults a second and available memory to show any high level signs of memory problems.

sql server target memory and sql server total memory should also be the same value.

---------------------------------------------------------------------

Denby Hall of Fame Points: 3120 More actions · Answer 4

The problem occured again this Morning.

I have been using the values from perfmon.........this time it is showing Total Memory at 1.5GB and Target mem at 1.5 GB.

as shown in the results below from DBCC memory Stats, Total SQL mem and Target mem is different from the results shown above.

(Current)

Commited206416 or 1.6GB

Target 206416 or 1.6GB

Hashed 201558

(Previuos)

Buffer Counts) (VALUE)

Commited 850983 or 6.8 GB

Target 850983 or 6.8GB

Buffer cache hit ratio currently: 99.86

Page faults/sec: 1879

Available physical mem: 5.4 GB

As far as the error logs are concerned, i'm seeing a lot of excessive locks\deadlook in the early morning during 3-5 am(occurs everyday).

Also from my understanding the \PEA setting should allow SQL to address memory above 4GB and we have 8 GB installed.

S. Kusen SSChampion Points: 10936 More actions · Answer 5

How much physical memory do you have on each node of the cluster? I'm assuming they are both configured with the same amount of memory.

Instead of failing the services over, try to simply stop and start them from Cluster administrator the next time you have the issue.

I recently had memory added to a SQL 2000 cluster, and stopping/starting the service after changing the max value was necessary. However, it almost seems as though other processes on the server are getting the memory before SQL Server needs it.

In SQL 2000 with the max memory option set, SQL will only grab that memory when it needs it.

I would try to set the min memory to 2GB and max to 6.4, stop and start SQL SErver, then check the perf mon values when it comes back up.

Do you have other services running on the systems? Any background services?

george sibbald SSC Guru Points: 104210 More actions · Answer 6

Hmmm, page\sec way too high. This value can be thrown out though by processes such as backing up large files to tape, as these really thrash the memory, so check your average is not being thrown out by periods of intense activity,

This value can also point to a disk bottleneck, so check out those counters too, see

http://www.microsoft.com/technet/prodtechnol/windows2000serv/reskit/core/fnec_evl_qnjo.mspx?mfr=true

Not sure why memory allocated to SQL is decreasing, but some other process must require it. Track process:private bytes and/or process:working set to see if anything is increasing. A slow constant increase could suggest a memory leak. I believe a sql fix between SP4 and rollup 2187 addressed a memory leak issue.

AWE allocation is dynamic in windows 2003 so perhaps attempt to fix the memory allocated to SQL by setting min and max the same. Make sure enough left for OS though, I presume no other apps running on your SQL cluster.

If anyone else out there got suggestions, feel free to chime in.:)

---------------------------------------------------------------------

SQLBOT SSCrazy Eights Points: 8014 More actions · Answer 7

george sibbald (1/3/2009)
Hmmm, page\sec way too high. This value can be thrown out though by processes such as backing up large files to tape, as these really thrash the memory, so check your average is not being thrown out by periods of intense activity,
This value can also point to a disk bottleneck, so check out those counters too, see
http://www.microsoft.com/technet/prodtechnol/windows2000serv/reskit/core/fnec_evl_qnjo.mspx?mfr=true
Not sure why memory allocated to SQL is decreasing, but some other process must require it.

I think the memory SQL Server is using is right where it needs to be.

The cache hit ratio is great so SQL is getting what it needs from the cache.

the system looks to be disk bound.

I bet there's high log writes/flushes per second and lots of log waits.

also I/O waits.

Anyway, the trick is to look at waits instead of measures (bad pun, but true).

Craig Outcalt

Tips for new DBAs: http://www.sqlservercentral.com/articles/Career/64632
My other articles: http://www.sqlservercentral.com/Authors/Articles/Craig_Outcalt/560258

george sibbald SSC Guru Points: 104210 More actions · Answer 8

SQLBOT (1/3/2009)
george sibbald (1/3/2009)
Hmmm, page\sec way too high. This value can be thrown out though by processes such as backing up large files to tape, as these really thrash the memory, so check your average is not being thrown out by periods of intense activity,
This value can also point to a disk bottleneck, so check out those counters too, see
http://www.microsoft.com/technet/prodtechnol/windows2000serv/reskit/core/fnec_evl_qnjo.mspx?mfr=true
Not sure why memory allocated to SQL is decreasing, but some other process must require it.
I think the memory SQL Server is using is right where it needs to be.
The cache hit ratio is great so SQL is getting what it needs from the cache.
the system looks to be disk bound.
I bet there's high log writes/flushes per second and lots of log waits.
also I/O waits.
Anyway, the trick is to look at waits instead of measures (bad pun, but true).

yea, I guess we need denby to confirm what he means by SQL 'comes to a halt'. Does it just get really slow or is it not possible to log in at all(via anything including EM, QA)

---------------------------------------------------------------------

Denby Hall of Fame Points: 3120 More actions · Answer 9

thanks again for the response.

(SQL coming to a Halt)>> Users complain of slow performances when this issue occurs and jobs timing out.

i also noticed this error in the event log:

Error 1229: Process ID %d:%d owns resources that are blocking processes on scheduler %d.

the above error poped up in event log for every 5 seconds for 3 hrs.

i tried something different this time, as SK suggested, i had went ahead and change min mem setting from default of 0 to 4Gb (i did this thinking SQL will throw an exeption error not enough memory available should this problem occur again) and restarted SQL service. Memory usage was back to 6.5GB(all is good, at least for now).

I also setup a trace log on memory counters to monitor memory usage and hopefully to see at what time does memory usage starts to fall.

Other services: OLAP and Report services

george sibbald SSC Guru Points: 104210 More actions · Answer 10

sounds like a server under stress problem caused by the app. This may help

http://support.microsoft.com/default.aspx?scid=kb;en-us;319892

next time it happens run sp_who2 to get info on processes running, I would expect there to be blocking going on (if a number of users complain about slowness and time outs occur, blocking is a frequent cause). A quick fix would be to kill the spid at the head of the blocking chain, this should correspond to the spid mentioned in the 1229 error message.

To get more info on this process, use dbcc inputbuffer, sp_locks and profiler and various selects from sysprocesses with order by to see whats going on. As SQLBOT suggested ordering by waittypes could help identify any bottlenecks.

---------------------------------------------------------------------

Denby Hall of Fame Points: 3120 More actions · Answer 11

Thanks for the info george i'll start looking into this.