Blog Post

Index Black Ops Part 2 – Page IO Latch, Page Latch

,

As I mentioned in my TSQL2sDay index summary post, the I am writing a few posts on the information that is contained in sys.dm_db_index_operational_stats. The posts will be the following:

Today’s post is the second in the series and will discuss the page IO and page latch information that is available. We’ll cover what this information is. Afterwards we will look at querying the DMV for the data. And finally wrapping up with some idea on how you can use this data.

What Are Latches?

If you aren’t familiar with latches, they are very similar to locks within SQL Server. Though they are less intrusive and are design in a manner that prevents locking collisions, such as deadlocks. While locks control and manage the state of data in a SQL Server database, the latches will control and manage the location of the data.

  • Page Latch – These are waits that occur when a worker needs to wait for a page to become available. This typically occurs on a page is already available in memory.
  • Page IO Latch – These are waits that occur when a needs to wait for a page due to physical I/O. Such as when a page needs to be made available in the buffer pool for reading or writing and SQL Server needs to retrieve it from disk.

Like locks, latches aren’t a bad thing. Latches waiting excessively on other latches though… well too much of anything can be a bad thing.

Indexes with Latches

There are a few columns in sys.dm_db_index_operational_stats that are important when investigating latches on indexes. These columns are:

  • page_latch_wait_count (bigint) – Cumulative number of times the Database Engine waited, because of latch contention.
  • page_latch_wait_in_ms (bigint) - Cumulative number of milliseconds the Database Engine waited, because of latch contention.
  • page_io_latch_wait_count (bigint) – Cumulative number of times the Database Engine waited on an I/O page latch.
  • page_io_latch_wait_in_ms (bigint) - Cumulative number of milliseconds the Database Engine waited on a page I/O latch.
  • tree_page_latch_wait_count (bigint) – Subset of page_latch_wait_count that includes only the upper-level B-tree pages. Always 0 for a heap.
  • tree_page_latch_wait_in_ms (bigint) - Subset of page_latch_wait_in_ms that includes only the upper-level B-tree pages. Always 0 for a heap.
  • tree_page_io_latch_wait_count (bigint) – Subset of page_io_latch_wait_count that includes only the upper-level B-tree pages. Always 0 for a heap.
  • tree_page_io_latch_wait_in_ms (bigint) – Subset of page_io_latch_wait_in_ms that includes only the upper-level B-tree pages. Always 0 for a heap.

Now we won’t be looking at any queries with the tree latches. That’s something that can be covered in a more in-depth conversation. But suffice to say, want generally applies at the leaf level also applies at the tree level.

Page IO Latch Demo

We’ll start with a demo that will generate some page IO latch waits.  The query I am using to generate page IO latches is below.  All it is going to measure is the time that SQL Server spent waiting for the pages it needs to be read from disk.  If the disk on the server you are using is great compared to my laptop drive you may see different results.

USE AdventureWorks
GO
SELECT *
FROM Sales.SalesOrderDetail

Now we’ll run the query below to investigate the effect that the load had on the database:

SELECT OBJECT_SCHEMA_NAME(ios.object_id) + '.' + OBJECT_NAME(ios.object_id) as table_name
,i.name as index_name
,page_io_latch_wait_count
,page_io_latch_wait_in_ms
,CAST(100. * page_io_latch_wait_in_ms / NULLIF(page_io_latch_wait_count ,0) AS decimal(12,2)) AS page_io_avg_lock_wait_ms
FROM sys.dm_db_index_operational_stats (DB_ID(), NULL, NULL, NULL) ios
INNER JOIN sys.indexes i ON i.object_id = ios.object_id AND i.index_id = ios.index_id
WHERE OBJECTPROPERTY(ios.object_id,'IsUserTable') = 1
ORDER BY 5 DESC

If you don’t see the waits appear, try running multiple copies of the query or use a table that is larger.  That will do it for you.  The results of the query should look similar to the following:

This is a picture of query results.

As the picture indicates, there were page IO latch waits accumulated on the index PK_SalesOrderDetail_SalesOrderID_SalesOrderDetailID.  These occurred on the concurrent executions of the query above while they wait for the first query to load the data it needed into the buffer pool.  Since they were all going after the same data, they needed to wait instead of loading a copy for their use.

Page Latch Demo

As I mentioned in the introduction, page latches occur when a worker is picking up a page for an operation.  To generate some page latches we’ll execute an update statement.  And as before, the shoddy resources on my laptop will likely generate page latches.  You results may vary, but if they do just increase the volume of the query.

UPDATE sod
SET UnitPrice = UnitPrice * 1.05
FROM Sales.SalesOrderDetail sod

Now we’ll run the query below to investigate the effect that the load had on the database:

SELECT OBJECT_SCHEMA_NAME(ios.object_id) + '.' + OBJECT_NAME(ios.object_id) as table_name
,i.name as index_name
,page_latch_wait_count
,page_latch_wait_in_ms
,CAST(100. * page_latch_wait_in_ms / NULLIF(page_latch_wait_count ,0) AS decimal(12,2)) AS page_avg_lock_wait_ms
FROM sys.dm_db_index_operational_stats (DB_ID(), NULL, NULL, NULL) ios
INNER JOIN sys.indexes i ON i.object_id = ios.object_id AND i.index_id = ios.index_id
WHERE OBJECTPROPERTY(ios.object_id,'IsUserTable') = 1
ORDER BY 5 DESC

The results of the query should look similar to the following:

This is a picture of query results.

Like before, the effect of the page latches are seen above.

Life, The Universe, and Everything

Great!  With the scripts above I’ve demonstrated how you can identify indexes that have high latch occurrences and the amount of time spent on them.  It’s like I’m a computer in front of large crowd only will to say 42, over and over again.  Not exactly useful but we know it means something.

When will you know that you need to take a look at latches?  What situations will happen when knowing which index that latches are occurring on will these scripts assist you.

There are a number of reason you may want to look into these.  If you have a high volume of page IO latches on an index you may wonder why queries need to go to disk so often to to get data for the index.  Is this a memory issue where data is being pushed out?  Is there a report problem where a ad-hoc report is flooding the buffer pool with data that the OLTP system doesn’t need for its regular processing?

As I work through these kinds of issues with client, I plan to write more about them to share what I learn.  In the meantime, I’d recommend reading SQL Server 2005 Waits and Queues and Troubleshooting Performance Problems in SQL Server 2008. These are a couple excellent white papers that can get anyone going when it comes knowing when to look into latch issues.

Related posts:

  1. Index Black Ops Part 1 – Locks and Blocking
  2. Snapshots of Index Contention
  3. Monitoring Index Contention With DMFs

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating