Why A Query May be Faster the Second Time it Runs

Today we have a guest editorial from Kendra Little as Steve is away on his sabbatical.

A discussion came up on Twitter recently about database query performance: if you execute a query twice, it may be faster the second time it runs.

This can happen for a variety of reasons, but often the answer lies in how the database engine takes advantage of memory. For example, SQL Server will cache the data used by the query in the buffer pool whenever it can. If the data is not forced from the buffer pool due to failover, memory pressure, or another action, subsequent executions of this query (or other queries which happen to use the same data) can reference the data directly from memory and skip a slow trip to read the data from disk.

When testing this from an interactive query window against SQL Server, one of my longtime favorite commands is the following:

SET STATISTICS TIME, IO ON;

After this statement is run, you will receive information about how many reads were done by the query in the following categories:

Logical reads – the total number of reads done (these may or may not have read from disk)
Physical reads – a physical read copying data from disk into the buffer pool cache. In Statistics IO output this number does not include read-ahead reads.
Read-ahead reads – an optimized type of physical read (from disk).

You can learn roughly how much your query took advantage of data stored in the buffer pool cache by doing a little math: logical reads – (physical reads + read-ahead reads) = roughly how much data already in memory was used.

If you’d like to learn more about using the STATISTICS TIME and STATISTICS IO commands, I’ve written a post on some tricks you can use with these commands.