Latch waits

  • I've been noticing the following entries in the server's error log recently, and they're getting more common.

    2005-11-22 08:40:06.06 spid258   WARNING: EC 3cb81458, 0 waited 2400 sec. on latch 1b4cb20, class MISC.  Not a BUF latch.

    2005-11-22 08:40:06.06 spid258   Waiting for type 0x4, current count 0xa, current owning EC 0x000006F030482538.

    The process that logs this error is always running either a backup log or a checkpoint.

    The SQL Server is running in an Active-Passive cluster environment with all databases and backups on a SAN.

    Does anyone have any idea what could be causing this error?

    Thanks in advance

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Read from about halfway down this page:

    http://www.sql-server-performance.com/performance_monitor_counters_sql_server.asp

    It explains what latch waits are and how either memory or I/O can be causing your messages.

    Google - troubleshooting's best friend - has MANY links on latch waits - this was merely the first but probably quite relevant to what you wanted. 

  • Thanks, interesting reading (the entire article)

    I'm tending to suspect there's a problem with the disk system (though the SAN people disagree, naturally). The server's not short on memory; it has 48GB.

    The latch waits seem to correspond with a backup log taking unusually long.

    I'll browse Google when I've got a few hours. The latch problem has been passed on to Microsoft for investigation since it was the cause of major downtime yesterday.
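
    To check whether the warnings really line up with the slow log backups, the entries can be pulled out of the error log and compared against the backup job times. The following is only a minimal sketch, assuming every warning line has the same layout as the one quoted in the first post. The function name `parse_latch_warning` and the dictionary keys are made up for illustration and are not part of any SQL Server tooling.

```python
import re
from datetime import datetime

# Pattern modelled on the single warning line quoted in this thread.
# Assumption: other latch warnings in the error log use the same layout.
LATCH_WARNING = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.\d+\s+"
    r"(?P<spid>spid\d+)\s+WARNING: EC (?P<ec>[0-9a-fA-F]+), \d+ "
    r"waited (?P<seconds>\d+) sec\. on latch (?P<latch>[0-9a-fA-F]+), "
    r"class (?P<latch_class>\w+)\."
)


def parse_latch_warning(line):
    """Return the fields of a latch-wait warning, or None for any other line."""
    match = LATCH_WARNING.match(line.strip())
    if match is None:
        return None
    return {
        "timestamp": datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S"),
        "spid": match.group("spid"),
        "waited_seconds": int(match.group("seconds")),
        "latch": match.group("latch"),
        "latch_class": match.group("latch_class"),
    }


if __name__ == "__main__":
    sample = ("2005-11-22 08:40:06.06 spid258   WARNING: EC 3cb81458, 0 waited "
              "2400 sec. on latch 1b4cb20, class MISC.  Not a BUF latch.")
    print(parse_latch_warning(sample))
```

    Running every line of the error log through this and listing the timestamps next to the backup job history would show whether the two actually coincide.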

  • I remember seeing this a very long time ago.  It was fixed by one of these two things:

    I turned off "fiber" mode.

    or

    I turned off the "priority boost" option.

    I don't remember which it was....

    http://support.microsoft.com/kb/310834

    hth jg


  • I've had the same thing going on since installing SP4, which displays more diagnostic info. IMHO it doesn't mean that anything has changed with the disk; you're just seeing it now.

  • Gail,

    I started seeing the same warning message on one of our servers (EE 2000 SP4). I just wanted to check with you whether Microsoft said anything about this warning. Should we ignore it or fix it? All 3 settings (priority boost, lightweight pooling and set working set) are turned off on the server.

    thanks,

    -Jessie

  • If latch activity is higher than expected, this often indicates one of two potential problems. First, it may mean your SQL Server could use more memory. If latch activity is high, check what your buffer cache hit ratio is. If it is below 99%, your server could probably benefit from more RAM. If the hit ratio is above 99%, it could be that the I/O system is contributing to the problem, and a faster I/O system might benefit your server's performance. I hope this helps.
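
    The rule of thumb above can be written down as a small check. This is only a sketch, assuming the raw "Buffer cache hit ratio" and "Buffer cache hit ratio base" counter values have already been read from the server. The function names are hypothetical, and the 99% cut-off is simply the figure from this post, not an official limit.

```python
def buffer_cache_hit_ratio(ratio_counter, base_counter):
    """Hit ratio as a percentage, from the raw counter and its base counter."""
    if base_counter <= 0:
        raise ValueError("base counter must be positive")
    return 100.0 * ratio_counter / base_counter


def likely_bottleneck(hit_ratio_pct, threshold_pct=99.0):
    """Apply the post's heuristic: a low hit ratio points at memory, otherwise at I/O."""
    return "memory" if hit_ratio_pct < threshold_pct else "io"


if __name__ == "__main__":
    ratio = buffer_cache_hit_ratio(9850, 10000)  # made-up counter values
    print(ratio, likely_bottleneck(ratio))
```

    The result is only a hint about where to look first. It does not rule out the other cause.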

  • Ask your SAN guy to provide you with the SAN logs; they will be very helpful.

  • Thanks for your reply! We have 12GB of memory on this server with only 5 user databases. The biggest db is 65GB, a JDE application. The average buffer cache hit ratio is 98 to 99, which is OK. I'll ask for the SAN logs. Still, it's strange that this warning suddenly showed up multiple times in just 1 day and the backup job took super long.
