SQL 2000 cluster with AWE setting

  • Hi guys,

    We have been experiencing a major production issue, Memory usage on the active node is at 2.6 GB even though there is 8GB installed on each node.

    SQL and OS version:

    SQL: 2000 ent 32 bit SP4

    OS: Windows 2003 enterprise 32bit SP1

    Other settings:

    1. Awe enabled

    2. Max SQL server mem at 6.4 GB

    3. Lock Pages granted to SQL service group

    whenever this problem occur to fix it i had to failover to node 2........which will show mem usage up to 7.4GB. However after five days it went back down to 2.6 GB bringing SQL to a halt (we then failed back to node 1 and everything seem fine at least for now).

    Has anyone experienced this before?

    Any solutions to this problem is greatly appreciated

  • possibly this bug?

    http://support.microsoft.com/kb/899761

    If I remember rightly the original release of SP4 contained this flaw, so it was re-released, so where did you get your SP4 executable from? Presumably you only applied it recently, so is it a copy of SP4 you have had for a while and are you sure its the latest version of it?

    You say mem usage starts at 7.4GB even though SQL max memory is set to 6.4GB, so presumably other processes are using some of this. Are you running perfmon to track memory usage of processes on the server to see whats happening?

    Are the PAE and 3GB switches set in the boot.ini?

    ---------------------------------------------------------------------

  • Thanks for ur reply..

    i recently inherited this environment 5 months however they had this same prob then.

    SP version>> SP4 8.00.2040

    The PEA swith is set in the boot.ini file on both servers.

    The issue is we are seeing total memory usage at 2.6 GB in task manager, even thought max mem setting is at 6.4 GB.

    As for tracking mem usage, i have been doing so by looking at DBCC memorystatus results but i don't see anything that would have indicated memory pressure see below.

    (Buffer dist) (Buffers)

    Stolen 3616

    Free 1369

    Procedures14952

    Inram 0

    Dirty 13809

    Kept 0

    I/O 0

    Latched 184

    Other 817053

    (Buffer Counts) (VALUE)

    Commited 850983

    Target 850983

    Hashed 831046

    InternalReservation731

    ExternalReservation356

    Min Free 656

    Visible 199456

    (Dynamic Memory Man) (Buffers)

    Stolen 18113

    OS Reserved 2488

    OS Committed 2466

    OS In Use 2452

    General 1897

    QueryPlan 15037

    Optimizer 21

    Utilities 94

    Connection 2396

    ALSO SQL Server: (Total server memory) in perfmon is now showing 6.4 GB

  • when using AWE the value shown for sql server memory in task manager is incorrect, use the value from perfmon.

    I would also set the /3GB switch in the boot.ini to take maximum advantage of the memory below the 4GB line.

    what do you mean when you say sql server comes to a halt, are you sure its memory pressure? Any errors in the log?

    track perfmon counters such as buffer cache hit ratio, page faults a second and available memory to show any high level signs of memory problems.

    sql server target memory and sql server total memory should also be the same value.

    ---------------------------------------------------------------------

  • The problem occured again this Morning.

    I have been using the values from perfmon.........this time it is showing Total Memory at 1.5GB and Target mem at 1.5 GB.

    as shown in the results below from DBCC memory Stats, Total SQL mem and Target mem is different from the results shown above.

    (Current)

    Commited206416 or 1.6GB

    Target 206416 or 1.6GB

    Hashed 201558

    (Previuos)

    Buffer Counts) (VALUE)

    Commited 850983 or 6.8 GB

    Target 850983 or 6.8GB

    Buffer cache hit ratio currently: 99.86

    Page faults/sec: 1879

    Available physical mem: 5.4 GB

    As far as the error logs are concerned, i'm seeing a lot of excessive locks\deadlook in the early morning during 3-5 am(occurs everyday).

    Also from my understanding the \PEA setting should allow SQL to address memory above 4GB and we have 8 GB installed.

  • How much physical memory do you have on each node of the cluster? I'm assuming they are both configured with the same amount of memory.

    Instead of failing the services over, try to simply stop and start them from Cluster administrator the next time you have the issue.

    I recently had memory added to a SQL 2000 cluster, and stopping/starting the service after changing the max value was necessary. However, it almost seems as though other processes on the server are getting the memory before SQL Server needs it.

    In SQL 2000 with the max memory option set, SQL will only grab that memory when it needs it.

    I would try to set the min memory to 2GB and max to 6.4, stop and start SQL SErver, then check the perf mon values when it comes back up.

    Do you have other services running on the systems? Any background services?

  • Hmmm, page\sec way too high. This value can be thrown out though by processes such as backing up large files to tape, as these really thrash the memory, so check your average is not being thrown out by periods of intense activity,

    This value can also point to a disk bottleneck, so check out those counters too, see

    http://www.microsoft.com/technet/prodtechnol/windows2000serv/reskit/core/fnec_evl_qnjo.mspx?mfr=true

    Not sure why memory allocated to SQL is decreasing, but some other process must require it. Track process:private bytes and/or process:working set to see if anything is increasing. A slow constant increase could suggest a memory leak. I believe a sql fix between SP4 and rollup 2187 addressed a memory leak issue.

    AWE allocation is dynamic in windows 2003 so perhaps attempt to fix the memory allocated to SQL by setting min and max the same. Make sure enough left for OS though, I presume no other apps running on your SQL cluster.

    If anyone else out there got suggestions, feel free to chime in.:)

    ---------------------------------------------------------------------

  • george sibbald (1/3/2009)


    Hmmm, page\sec way too high. This value can be thrown out though by processes such as backing up large files to tape, as these really thrash the memory, so check your average is not being thrown out by periods of intense activity,

    This value can also point to a disk bottleneck, so check out those counters too, see

    http://www.microsoft.com/technet/prodtechnol/windows2000serv/reskit/core/fnec_evl_qnjo.mspx?mfr=true

    Not sure why memory allocated to SQL is decreasing, but some other process must require it.

    I think the memory SQL Server is using is right where it needs to be.

    The cache hit ratio is great so SQL is getting what it needs from the cache.

    the system looks to be disk bound.

    I bet there's high log writes/flushes per second and lots of log waits.

    also I/O waits.

    Anyway, the trick is to look at waits instead of measures (bad pun, but true).

  • SQLBOT (1/3/2009)


    george sibbald (1/3/2009)


    Hmmm, page\sec way too high. This value can be thrown out though by processes such as backing up large files to tape, as these really thrash the memory, so check your average is not being thrown out by periods of intense activity,

    This value can also point to a disk bottleneck, so check out those counters too, see

    http://www.microsoft.com/technet/prodtechnol/windows2000serv/reskit/core/fnec_evl_qnjo.mspx?mfr=true

    Not sure why memory allocated to SQL is decreasing, but some other process must require it.

    I think the memory SQL Server is using is right where it needs to be.

    The cache hit ratio is great so SQL is getting what it needs from the cache.

    the system looks to be disk bound.

    I bet there's high log writes/flushes per second and lots of log waits.

    also I/O waits.

    Anyway, the trick is to look at waits instead of measures (bad pun, but true).

    yea, I guess we need denby to confirm what he means by SQL 'comes to a halt'. Does it just get really slow or is it not possible to log in at all(via anything including EM, QA)

    ---------------------------------------------------------------------

  • thanks again for the response.

    (SQL coming to a Halt)>> Users complain of slow performances when this issue occurs and jobs timing out.

    i also noticed this error in the event log:

    Error 1229: Process ID %d:%d owns resources that are blocking processes on scheduler %d.

    the above error poped up in event log for every 5 seconds for 3 hrs.

    i tried something different this time, as SK suggested, i had went ahead and change min mem setting from default of 0 to 4Gb (i did this thinking SQL will throw an exeption error not enough memory available should this problem occur again) and restarted SQL service. Memory usage was back to 6.5GB(all is good, at least for now).

    I also setup a trace log on memory counters to monitor memory usage and hopefully to see at what time does memory usage starts to fall.

    Other services: OLAP and Report services

  • sounds like a server under stress problem caused by the app. This may help

    http://support.microsoft.com/default.aspx?scid=kb;en-us;319892

    next time it happens run sp_who2 to get info on processes running, I would expect there to be blocking going on (if a number of users complain about slowness and time outs occur, blocking is a frequent cause). A quick fix would be to kill the spid at the head of the blocking chain, this should correspond to the spid mentioned in the 1229 error message.

    To get more info on this process, use dbcc inputbuffer, sp_locks and profiler and various selects from sysprocesses with order by to see whats going on. As SQLBOT suggested ordering by waittypes could help identify any bottlenecks.

    ---------------------------------------------------------------------

  • Thanks for the info george i'll start looking into this.

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply