help understanding latches

  • Hello,

    I'm an SAN Admin and have no experience in MS SQL Server. I'm currently investigating a performance problem which I'm told is showing itself with high latch wait times. I understand that a latch is a form of lock, and I also understand that reading from a disk subsystem back to memory can result in latch waits.

    What I don't know is:

    1) Can latch waits occur when SQL is forced to write to the I/O subsystem?

    2) Can updates to the transaction log result in latch waits?

  • I'd probably suggest you find an experienced DBA as this isn't a simple question to answer. latches are lightweight "locks" and generally not too much to worry about, if lock usage goes high latches usually go high too. You need to read up on sql server waits and locking. as to latches vs disk perf - no not really i guess you're considering waits which show pageio_latch or similar - again you need to understand waits. I've a ms whitepaper for sql 2000 waits on my web site http://www.grumpyolddba.co.uk - go to the sql 2000 page.

    I usually find SAN's to be a sql bottleneck as most vendors/admins don't understand sql server io requirements. Basically most production sql servers need dedicated spindles ( no shared luns - at all ) and raid 10 / raid 1 any other raid is only useful for a read only database. Check partition allignment - there's some posts about this elsewhere on this section. use windows disk io completion time to check if you've got problems - block sizes on format and queue/buffer depth for HBA's will also help/hinder depending upon settings.

    [font="Comic Sans MS"]The GrumpyOldDBA[/font]
    www.grumpyolddba.co.uk
    http://sqlblogcasts.com/blogs/grumpyolddba/

  • WOW latches is a very broadbased discussion and sometimes very difficult to explain .

    As the previous replay you had states it is lightweight "locks" ...some people do refer to Lathes as locks but the terminologies have a huge difference.

    Locks can be controlled and ensures transaction integrity ...controlled meaning using different isloation levels, locking hints,lock settings like lock timeout values ...like I said more part of the transaction architecture of sql server ...

    Lathes are lightweight io syncronization objetcs that get's used deep in the sql server relational engine and storage engine . There are not many whitepapers and discussion around lathes and up and till today I still don't understand why ..maybe due to the complexity of the component within the engine.

    EI : when the realtion engine requires data from a storage engine ...it uses a latch in the process of requesting the data and supplying the data (in a nutshell) . The latch is being used to make sure the the data that is being retrieved from the pages are not madified ect . You see why some people say it is a lightweight lock ? ..

    Lathes are also used if database and logfiles extend ...the discussion can go to deep storage engine talks about unified extends and mixed extands ...global allocation maps and sgams ect ....

    You can monitor latches using perfmon ...normally look at latches that waits ...I have been in the situasion recently where I have accessive latch waitimes of 5-6 seconds per latchs and got the latcheing down not using the trace flag 1118 that reduces latch contension but by not autogrowing my databases and log files ..using multiple data files on multiple luns for tempdb ect ..

    Your dba also can look at queries which is a different ball game by itself ..if you want to see that queries can cause excessive latching

    Hope this helps

    W

  • Most people talk as if Locks & Latches do the same thing. They stating a Latch is a lightweight version of a Lock, which really doesn’t tell you much. I found it much simpler to understand if you view them as similar but completely different objects used for a completely different purpose. They aren't the same & one is not a subset of the other.

    SQL is a multi-threaded, multi-user application.

    As with all multi-threaded applications you need to ensure that the threads do not trip over each other. Mutexes, Latches and other approaches have been developed to perform the task of thread synchronisation. They are industry standard practices used in all Multi-threaded apps. For example if 1 thread is actually reading a page from disk & creating a memory structure to contain it, it will create 1 or more Latches (and perhaps other thread synisation objects) to prevent corruption of those structures. Once that operation is complete, the thread synchronisation objects (latches) will be released & the other threads will be able to access that page & memory structures again.

    Thus latches are mostly very transient, taking a few millisecs to complete. Unless of course the piece of code/memory they need is already "Reserved / blocked" by some other thread. In which case the requesting thread is put to sleep while it waits for the spinlock/latch/mutex to be released.

    On the other hand multi-user applications need a way to isolate user activity. a Lock is part of a user defined thing called a Transaction that prevents different users from overwriting each other’s changes. Generally a lock will last for the duration of the transaction (or in the case of shared locks, as long as is needed to ensure the transaction can be correctly rolled back.)

    Of course the 2 have a high degree of correlation. When a user reads a page, transactional integrity will require a shared lock on the rows or pages touched & perhaps Intent Shared Locks on the Table & Page.

    While the actual IO is happening latches will occur while memory buffers are being modified to contain & point to those pages retrieved. Unlike a Lock, Latches do not care about the transaction, they don’t get rolled back.

    As more IO occurs, there is a greater probability that your thread will be blocked by a latch caused by IO requested by another thread. Hence the correlation between increased IO contention & increased latch activity.

  • Please note: Almost 3 year old thread.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • That's one of the best explanations I've read so far. Thanks lots.

  • Yes, That's one of the best explanations I've read so far. Thanks lots.

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply