SQLServerCentral Article

Bookmark Lookups

,

Bookmark Lookups

One of the major overheads associated with the use of non-clustered indexes is the cost of bookmark lookups. Bookmark lookups are a mechanism to navigate from a non-clustered index row to the actual data row in the base table (clustered index) and can be very expensive when dealing with large number of rows.

When a small number of rows are requested by a query, the SQL Server optimizer will try to use a non-clustered index on the column or columns contained in the WHERE clause to retrieve the data requested by the query. If the query requests data from columns not contained in the non-clustered index, SQL Server must go back to the data pages to obtain the data in those columns. It doesn’t matter if the table contains a clustered index or not, the query will still have to return to the table or clustered index to retrieve the data.

Bookmark lookups require data page access in addition to the index page access needed to filter the table data rows. Because this requires the query to access two sets of pages instead of only one, the number of logical READS performed by the query will increase. If the data pages are not in the buffer cache, additional I/O operations will be required. And in the case of most large tables, the index page and the corresponding data pages are not usually located close to each other on the disk.

These additional requirements for logical READS and physical I/O can cause bookmark lookups to become quite costly. While this cost may be acceptable in the case of small result sets, this cost becomes increasingly prohibitive in the case of larger and larger result sets. In fact, as the result sets become larger and larger, the optimizer may consider the costs of the bookmark lookups to be too much and discard the non-clustered index and simply perform a table scan instead.

Example of a Bookmark Lookup

SET STATISTICS PROFILE ON

GO

USE pubs

GO

--Find phone number for White, Johnson

SELECT phone

FROM dbo.authors

WHERE au_lname = 'White'

AND au_fname = 'Johnson'

GO

Execution Plan (abridged)

Rows Executes StmtText

----------- ----------- ----------------------------------------------------------------------------------

11 SELECT [phone]=[phone] FROM [dbo].[authors]

11 |--Bookmark Lookup(BOOKMARK:([Bmk1000]), OBJECT:([pubs].[dbo].[authors]))

11 |--Index Seek(OBJECT:([pubs].[dbo].[authors].[aunmind

Because both the au_lname and the au_fname are contained in a non-clustered index, the optimizer can use the non-clustered index to filter the rows contained in the table to return only the phone number requests. However, because the phone column in the authors table is not contained in the index or another non-clustered index, the optimizer must return to the authors table in order to return the matching phone number creating a bookmark lookup.

Finding the offending column(s)

In order to resolve the bookmark lookup, you must find the column or columns that cause the bookmark lookup. To find offending columns look for the index usage in the execution plan to find what index is utilized by the optimizer for the query.

Execution Plan (abridged)

StmtText

----------- ----------- ----------------------------------------------------------------------------------

|--Index Seek(OBJECT:([pubs].[dbo].[authors].[aunmind]), SEEK:([authors].[au_lname]='White' AND [authors].[au_fname]='Johnson') ORDERED FORWARD)

In this case we see that the authors.aunmind index is being used by the optimizer for the query. A quick check of the columns included in the index using sp_helpindex on the authors table will show that the index consists of the au_lname and au_fname columns.

Index_name    index_description                        index_keys

aunmind        nonclustered located on PRIMARY    au_lname, au_fname

A review of the execution plan OutputList column reveals that the phone column is only remaining column being requested by the query.

Execution Plan (abridged)

OutputList

----------------------------------------------------------------------------------

[authors].[phone]

Since the phone column is not in the index, you can deduct that the phone column is the offending column in this case.

Resolving bookmark lookups

Once you discover the columns responsible for a bookmark lookup, you will need to consider one of four methods that are available to resolve the bookmark lookup.

  1. Create a covering index
  2. Remove the offending column
  3. Convert a non-clustered index into a clustered index

Create a covering index

Given the example listed earlier in this section, if the following covering index had been created, the result would be the removal of the bookmark lookup from the execution plan.

CREATE NONCLUSTERED INDEX ncl_authors_phone ON authors(au_lname, au_fname, phone)

GO

Execution Plan (Abridged)

SELECT [phone]=[phone] FROM [dbo].[authors] WHERE [au_lname]=@1 AND [au_fname]=@2

|--Index Seek(OBJECT:([pubs].[dbo].[authors].[ncl_authors_phone]), SEEK:([authors].[au_lname]=[@1] AND [authors].[au_fname]=[@2]) ORDERED FORWARD)

Remove the offending column

In the simple query below, the developer returned all the columns from the authors table when all the query asked for was the ID of the author.

SET STATISTICS PROFILE ON

GO

USE pubs

GO

--Find ID number for White, Johnson

SELECT *

FROM dbo.authors

WHERE au_lname = 'White'

AND au_fname = 'Johnson'

GO

Execution Plan (abridged)

StmtText

----------- ----------- ----------------------------------------------------------------------------------

SELECT * FROM [dbo].[authors] WHERE [au_lname]=@1 AND [au_fname]=@2

|--Bookmark Lookup(BOOKMARK:([Bmk1000]), OBJECT:([pubs].[dbo].[authors]))

|--Index Seek(OBJECT:([pubs].[dbo].[authors].[aunmind]), SEEK:([authors].[au_lname]='White' AND [authors].[au_fname]='Johnson') ORDERED FORWARD)

Removing the additional, unneeded columns and only returning the au_id column will remove the bookmark lookup in this case

SET STATISTICS PROFILE ON

GO

USE pubs

GO

--Find ID number for White, Johnson

SELECT au_id

FROM dbo.authors

WHERE au_lname = 'White'

AND au_fname = 'Johnson'

GO

Execution Plan (abridged)

StmtText

----------- ----------- ----------------------------------------------------------------------------------

SELECT [au_id]=[au_id] FROM [dbo].[authors] WHERE [au_lname]=@1 AND [au_fname]=@2

|--Index Seek(OBJECT:([pubs].[dbo].[authors].[aunmind]), SEEK:([authors].[au_lname]=[@1] AND [authors].[au_fname]=[@2]) ORDERED FORWARD)

Bookmark lookups are often caused by additional columns being returned in the data set “just in case” they are needed at a later date. Developers should strive to only include columns in their result sets which are needed for the defined query requirements. Additional columns can always be added at a later date.

Convert a non-clustered index into a clustered index

When developers are faced with bookmark lookups that cannot be removed with the other choices described above, an alternative choice would be to convert an existing index being used by the query into a clustered index. Converting an existing index into a clustered index will place all the columns of the table in the index and prevent the need for a bookmark lookup.

SET STATISTICS PROFILE ON

GO

USE pubs

GO

--Find information for White, Johnson

SELECT fname + ' ' + lname + ' Hire Date: ' + CAST(hire_date AS VARCHAR(12))

FROM dbo.employee

WHERE emp_id = 'PMA42628M'

GO

Execution Plan (abridged)

StmtText

----------- ----------- ----------------------------------------------------------------------------------

SELECT fname + ' ' + lname + ' Hire Date: ' + CAST(hire_date AS VARCHAR(12)) FROM dbo.employee
WHERE emp_id = 'PMA42628M'

|--Compute Scalar(DEFINE:([Expr1002]=[employee].[fname]+' '+[employee].[lname]+' Hire Date:
'+Convert([employee].[hire_date])))

|--Bookmark Lookup(BOOKMARK:([Bmk1000]), OBJECT:([pubs].[dbo].[employee]))

|--Index Seek(OBJECT:([pubs].[dbo].[employee].[PK_emp_id]),
SEEK:([employee].[emp_id]='PMA42628M') ORDERED FORWARD)

To resolve the bookmark lookup, the developer can change the existing clustered index on the lname, fname, and minit columns into a non-clustered index.

--change original clustered index into a non-clustered index
DROP INDEX employee.employee_ind
GO
CREATE INDEX employee_ind ON employee(lname,fname,minit) 
GO

Once the clustered index has been changed into a non-clustered index, a new clustered index can be built on the emp_id column to resolve the bookmark lookup. In this particular case the emp_id is the PRIMARY KEY of the table, so instead of an index, the developer needs to recreate a clustered PRIMARY KEY.

--Create new clustered index
--Drop CONSTRAINT
ALTER TABLE employee
DROP CONSTRAINT PK_emp_id
GO
--Recreate CONSTRAINT
ALTER TABLE employee
ADD CONSTRAINT PK_emp_id PRIMARY KEY  CLUSTERED  (emp_id) 
GO
--Test removal of bookmark lookup 
--Find information for White, Johnson
SELECT fname + ' ' + lname + ' Hire Date: ' + CAST(hire_date AS VARCHAR(12)) 
FROM dbo.employee
WHERE emp_id = 'PMA42628M'
GO
Execution Plan (abridged) 
StmtText                                                                             
----------- ----------- ---------------------------------------------------------------------
SELECT fname + ' ' + lname + ' Hire Date: ' + CAST(hire_date AS VARCHAR(12))  FROM dbo.employee
                                                                 WHERE emp_id = 'PMA42628M'
  |--Compute Scalar(DEFINE:([Expr1002]=[employee].[fname]+' '+[employee].[lname]+' Hire Date: 
                                                   '+Convert([employee].[hire_date]))) 
       |--Clustered Index Seek(OBJECT:([pubs].[dbo].[employee].[PK_emp_id]), 
                         SEEK:([employee].[emp_id]='PMA42628M') ORDERED FORWARD) 

While converting a non-clustered index into a clustered index is a possible solution to bookmark lookups, often applications depend on the current clustered index and this solution will be almost impossible to implement in the real world.

Summary

Bookmark lookups are often performance robbing operator which are overlooked by many developers and DBAs. Understanding bookmark lookups and how they can be prevented can often help optimize a poorly running query and change a “bad” query into a good one.

About the author

I started working with SQL Server in the mid 1990's after spending time as both a Visual Basic developer

and Microsoft Access developer. Numerous projects upsizing Access to SQL Server lead me to become a full-time

SQL Server DBA and I have remained one ever since.

I have a large variety of experiences dealing with SQL Server, Oracle, Sybase, DB2, Access, and other database

platforms over the years and have worked with environments that had over 30 Terrabytes of data and environments

that had 100's of database with only a few megabytes of data. During this time I have worked as a production DBA,

a development DBA, a mixture of both and as a database architect and have enjoyed all of those positions.

Visit me on my web site and blog: www.TransactSQL.Com and

Blog.TransactSQL.Com

Rate

4.48 (31)

You rated this post out of 5. Change rating

Share

Share

Rate

4.48 (31)

You rated this post out of 5. Change rating