Memory pressure issues

  • Hi everybody,

    We are running a Win2003 SP1 Server running SQL 2005 SP2 9.00.3042.00 Enterprise Edition. We have 16 processors and 32GB of memory. We begin to receive messages in the log:

    Message

    AppDomain 2 (MVXPRD.dbo[runtime].1) is marked for unload due to memory pressure.

    And then after, (sometimes hours later, sometimes minutes later) our SQL server stops responding with errors "not enough memory to run the query". All our users get hung up and we go down in flames. We are working with Microsoft on this issue but I thought I would throw it out to this great group of gurus and see if anyone has seen this before and maybe some things that were done to track down the cause. Here is what they are saying:

    "Two noticeable errors in SQL Log:

    1. Failed Virtual Allocate Bytes: FAIL_VIRTUAL_RESERVE 1048576

    2. Downgrading backup log buffers from 1024K to 64K

    To explain the second error, whenever we take log backup using VIRTUAL_DEVICE it tries to allocate memory from SQL MTL(MemToLeave) memory area, which is 384MB, by default. If, while performing the log operation, SQL does not have enough contiguous memory requested by log backup operation, SQL will try to shrink the buffer size to complete the log backup operation. From the “Downgrading backup log buffers from 1024K to 64K” message, it indicates MTL pressure which results either failure of the backup operation or slowness in completing the log operation and “Failed Virtual Allocate Bytes: FAIL_VIRTUAL_RESERVE 1048576” error is a definitive indication of MTL pressure.

    To provide more MTL area, because of accommodating larger memory request by SQL, you can implement the following action plan that will give you a relief on this memory pressure issue.

    1. Add –g512 switch as startup parameter and re-cycle SQL server

    2. Monitor the server performance for few days and check if the memory issue re-surfaces

    3. If yes, then we may need to figure out the root cause of the issue, as follows:

    a. Is there any in-process linked server? If yes, try to throw the linked servers out of process. Note: some linked servers stop working if we throw the associated .dll out-of-process.

    b. The VIRTUAL DEVICE backup request memory from MTL and it follows the calculation BLOCKSIZE + BUFFERCOUNT X MAXTRANSFERSIZE. It would be a good test if you can stop these VIRTUAL_DEVICE backup operations and try taking native SQL backups

    c. The error log doesn’t say that you’re using any 3rd party XProcs, that again takes memory from MTL

    d. You also don’t use sp_xml_preparedocument which again takes memory from MTL, if not using sp_xml_removedocument, will keep reserving the MTL memory not releasing it

    e. A large query plan may also occupy MTL area and that happens if either SQL is generating a bad plan or query is written badly. In this situation, you can either tune the memory or update the statistics of the tables on the database in regular interval, say every weekend. To update statistics for all the tables in a database, use the following command

    i. Use databasename

    ii. Exec sp_msforeachtable ‘update statistics ? with fullscan’"

    Then I got another technician telling me that we have too many databases (30) and my maintenance plans need to be reduced. I do a full backup of all databases nightly (1am) and 15 minute transaction log backups of only our production database. On Sunday, I run my maintenance tasks (rebuild indexes, update statistics and check database integrity). Here is what this analyst had to say:

    "According one of my previous mail there are +30 databases installed. It’s necessary to reduce the number. The maintenance jobs should be also reduced – there are overlapping jobs and e.g. backup ALL databases job"

    Does that sound right???? I have a Diagnostic tool that I run when we get these memory errors but I can't find any one thing that is causing this problem. Any help and/or suggestions from anyone will be greatly appreciated!!!

    Thanks,

    Isabelle

    Thanks!
    Bea Isabelle

  • ha! a real tricky one there. I've encountered some of these problems and resolved most of them before. you've been given some wrong advice however. The mem to leave area is indeed a problem with 32 bit systems - have you considerd going to 64bit?

    The maint plans issue is correct if you;re using sysmaint.exe ( sql 2000 ) as this and the xmlprepare and dtsrun all run out of process and this is what can screw things up. large plans are also a problem. try to limit the out of processes programs - please tell me you don't have the 3gb switch enabled.

    16 procs ? cores . cpu's or + HT?

    [font="Comic Sans MS"]The GrumpyOldDBA[/font]
    www.grumpyolddba.co.uk
    http://sqlblogcasts.com/blogs/grumpyolddba/

  • Hi,

    Unfortunately, 64bit is not something that we are thinking about at the moment. We only have /PAE enabled in the boot.ini

    This is what is in the boot.ini

    [boot loader]

    timeout=30

    default=multi(0)disk(0)rdisk(0)partition(2)\WINDOWS

    [operating systems]

    multi(0)disk(0)rdisk(0)partition(2)\WINDOWS="Windows Server 2003, Enterprise" /fastdetect /noexecute=optin /PAE

    Our CPU's are Intel Xeon MP 3 GHz proc’s – there are 4 physical CPU’s - but they are dual-core, hyperthreaded…

    Per Microsoft's suggestion, I have added the -g512 option to my SQL startup parameters which will increase our MTL from the default of 384MB to 512MB. I have a restart of our SQL Server scheduled for Sunday.

    I ran some scripts to get an idea of what we had going on in our production system and here is what I got:

    select sum(sqlbytes/1024) from syscacheobjects

    where sqlbytes > 8192;

    618 --> total number of pages in MTL

    SELECT SUM(PAGESUSED) FROM MASTER.DBO.SYSCACHEOBJECTS WHERE PAGESUSED >1;

    71817 * 8 (kb) / 1024 = 561 MB

    select count(*) FROM SYSCACHEOBJECTS;

    18084 * 8 (kb) / 1024 = 141 MB

    SELECT * FROM SYSCACHEOBJECTS WHERE PAGESUSED >1; --> this is what is being used in MTL

    17944 * 8 (kb) / 1024 = 140 MB

    Thanks,

    Isabelle

    Thanks!
    Bea Isabelle

  • well I'd advise you to disable hyperthreading, you may be suffering from too many threads, I know it sounds strange, but I've seen it a couple of times before.

    the -g will help for sure, it fixed many issues for me. There are some other problems with sql 2005 that can affect the size of the proc cache, I advise you to get to sp2 release 3186 as a minimum as I know this fixes some memory allocation " bugs" . Likewise there are some fixes which apply to the o/s if you're not at w2k3 sp2 I'd suggest you get to there too. I'm going to be posting some rdl's to my web site which I use for monitoring sql server, hopefully before christmas, I don't really want to start posting scripts here as it gets messy and some are quite complex.

    I'd suggest, as I usually do, that you download an eval of idera sql diagnostic manager and point it at your server ( hopefully you might also buy the product as it really is quite superb and frees up so much DBA time for other more pressing matters ) this will allow you to gather stats, trends and diagnostics.

    32bit is really old technology and much like the 640k is enough for anyone, the 2gb limit on 32 bit is in the same vein.

    [font="Comic Sans MS"]The GrumpyOldDBA[/font]
    www.grumpyolddba.co.uk
    http://sqlblogcasts.com/blogs/grumpyolddba/

  • try these: you're looking at multi page allocations as a %age of single and on the buckets you're looking at length of the chains ( high is maybe not good )

    select name, type, buckets_count, buckets_in_use_count,

    buckets_min_length, buckets_max_length, buckets_avg_length

    from sys.dm_os_memory_cache_hash_tables

    where type = 'CACHESTORE_SQLCP' or type = 'CACHESTORE_OBJCP';

    select type, name, memory_node_id, single_pages_kb, multi_pages_kb

    from sys.dm_os_memory_clerks

    where type = 'CACHESTORE_SQLCP' or type = 'CACHESTORE_OBJCP';

    select name, type, single_pages_kb, multi_pages_kb,

    single_pages_in_use_kb, multi_pages_in_use_kb

    from sys.dm_os_memory_cache_counters

    where type = 'CACHESTORE_SQLCP' or type = 'CACHESTORE_OBJCP';

    [font="Comic Sans MS"]The GrumpyOldDBA[/font]
    www.grumpyolddba.co.uk
    http://sqlblogcasts.com/blogs/grumpyolddba/

  • Hi,

    I will forward the suggestions about the hyperthreads to my sysadmin and take a look at the scripts you provided. Actually, we have purchased the SQL Diagnostic manager but I have not been able to start using it fully yet. Microsoft was pointing the finger at SQL DM because it uses the sp_OACreate and so my boss had me turn it off until we could get a handle on this memory issue. It is a great tool and I spent some time on the phone with one of their support techs getting things setup up properly.

    Thanks for the great info! I will see tomorrow if the -g switch helps.

    Isabelle

    Thanks!
    Bea Isabelle

  • used diag manager since 1999 , never found it to be a problem.

    [font="Comic Sans MS"]The GrumpyOldDBA[/font]
    www.grumpyolddba.co.uk
    http://sqlblogcasts.com/blogs/grumpyolddba/

  • 1) Version 3042 is a BAD version of SQL Server 2005 to be on. You are missing some very important updates related to the SP2 debacle(s). See here: http://blogs.msdn.com/psssql/archive/2007/04/06/post-sql-server-2005-service-pack-2-sp2-fixes-explained.aspx. Get patched up.

    2) Whenever dealing with MemToLeave issues you probably should be on the phone to Microsoft's Product Support team.

    3) 30 databases is nothing. I am supporting 6500+ databases on one server at one of my clients! 🙂 Your 'advisor' doesn't know what he is talking about.

    4) Make sure your 32 bit extended memory stuff is set up correctly. Search for PAE and AWE on the web and in BOL.

    5) I agree about disabling hyperthreading. Almost always results in better performance.

    6) Are you having overlapping backups at all? This is a often a bad thing.

    7) Consider getting a pro in to help you through this mess.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • I suggest you take a look at the white paper: "Troubleshooting Performance Problems in SQL Server 2005". The section on memory tuning /analysis is very good.

    Also, using the -g512 startup switch isn't really a fix. It's just a band-aid on the problem. Your issue is that you have either 1)not enough virtual memory or 2) too fragmented virtual memory. or both.

    A case with PSS on memory tuning may be in order also.

    Thanks.

  • Hi,

    I will take a look at that paper on SQL performance troubleshooting. We are currently working with a Microsoft engineer:

    Sanjay Karmakar

    Microsoft SQL Server Support Engineer

    I just thought I would see what other suggestions or experience other people had with this kind of issue. We knew that the -g512 switch was just a temporary fix, we were just trying to give us some breathing room while we try and figure out what is causing the problems. Currently we haven't had an alert since we put that switch so it gives me some time to do some research. I will keep you posted with a solution when we figure it out! Thanks so much for all your input. It really helps.

    Isabelle

    Thanks!
    Bea Isabelle

  • Just another question: Are you doing Database mirroring off of this server? We found that heavy use of Mirroring tends to use virtual memory a lot, especially on 32 bit.

    Craig

  • No, we are not currently using Database mirroring on this server.

    Isabelle

    Thanks!
    Bea Isabelle

  • virtual memory and memtoleave are totally different matters. the -g startup is not so much a fix as needed in certain circumstances. I've probaly mentioned this before but the real solution is 64bit. You should read up on Ken Henderson to understand memtoleave issues. I understand you can get memtoleave issues with sql litespeed, although I've not had those issues.

    [font="Comic Sans MS"]The GrumpyOldDBA[/font]
    www.grumpyolddba.co.uk
    http://sqlblogcasts.com/blogs/grumpyolddba/

  • We are having the same problems. Please update if the latest and greatest version of SP2 fixes this (9.00.3054).

  • Hi,

    We haven't experienced the memory pressure issue since we implemented the -g15 switch. I have applied the SP2 update on our test server (9.0.3186) but my boss wants me to wait before applying it to our production server. We want to give it some more time to see if the problems happens again. I plan on applying it to production for sure, but we want to see if the memory pressure is still an issue.

    Isabelle

    Thanks!
    Bea Isabelle

Viewing 15 posts - 1 through 15 (of 23 total)

You must be logged in to reply to this topic. Login to reply