Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 

Ctrl-alt-geek

Matt Bowler is a DBA at Trade Me, and teaches database design and administration courses at the local institute of technology. A recent but avid discoverer of SQL Server, Matt is a regular contributor to forums at MSDN, Experts Exchange and SSC and he blogs at mattsql.wordpress.com.

Wildcards and Indexes

I was asked a good question the other day, one that I wasn’t sure of the answer. The question was “does SQL Server use an index if you have a wildcard in the search predicate?”

My initial thought was it would if the wildcard was not the first character but I thought it would be worth testing.

I wanted a nice simple table to use for testing so I grabbed an Id and first, middle and last names from one of the AdventureWorks sample databases and ported them into a new table in my sandpit test database:

select 
	BusinessEntityID,
	FirstName,
	MiddleName,
	LastName 
into sandpit.dbo.tbl_WildCardTest
from AdventureWorks2012.Person.Person

Have a look at what we’ve got:

select * from sandpit.dbo.tbl_WildCardTest

On my SQL Server 2012 instance with the AdventureWorks2012 database this returns 19972 rows.

Create a non-clustered index on the LastName column:

use sandpit
go

create nonclustered index ix_WildCardTest 
on tbl_WildCardTest(LastName)

Verify:

select * from sys.indexes 
where [object_id] = object_id('dbo.tbl_WildCardTest')

Okay, I’ve had a quick scan through and found a nice selective last name – one that under normal circumstances will use our non clustered index.

select * 
from sandpit.dbo.tbl_WildCardTest
where LastName like 'Dievendorff'

This returns a single row and the execution plan as you would expect has an index seek against the non clustered index followed by a RID lookup on the heap.

So switch on show actual execution plan by pressing [ctrl]+[m] or using this button in the GUI.

ExecutionPlanButton

And run a range of queries with a wildcard at the start, at the end, and both:

select * from sandpit.dbo.tbl_WildCardTest
where LastName like 'Dievendorf%'

select * from sandpit.dbo.tbl_WildCardTest
where LastName like '%ievendorff'

select * from sandpit.dbo.tbl_WildCardTest
where LastName like '%ievendorf%'

And check the plans.

ExecutionPlans1

So when the wild card is not at the start of the search predicate SQL Server chooses an index seek. When there is a wild card at the beginning of the predicate SQL Server needs to use a scan – the index is still chosen because a scan of a narrow index is lighter weight than a scan of the heap.

Even without the wild card at the start of the string eventually the selectivity of the predicate drops to the point that the index seek is abandoned in favour of a table scan.

select * from sandpit.dbo.tbl_WildCardTest
where LastName like 'Die%'

select * from sandpit.dbo.tbl_WildCardTest
where LastName like 'Di%'

ExecutionPlans2

What about if we run the same tests but this time limit the columns in the select list so that they are covered by the index?

select LastName from sandpit.dbo.tbl_WildCardTest
where LastName like 'Dievendorf%'

select LastName from sandpit.dbo.tbl_WildCardTest
where LastName like '%ievendorff'

select LastName from sandpit.dbo.tbl_WildCardTest
where LastName like '%ievendorf%'

ExecutionPlans3

So the way SQL Server chooses seeks or scans is the same – the difference is that now the query is covered so the RID lookup isn’t needed.

And as we lose selectivity…

select LastName from sandpit.dbo.tbl_WildCardTest
where LastName like 'Die%'

select LastName from sandpit.dbo.tbl_WildCardTest
where LastName like 'Di%'

ExecutionPlans4

This time because the index covers the query SQL Server chooses an index seek over the table scan and will do so even for the least selective query:

select LastName from sandpit.dbo.tbl_WildCardTest
where LastName like 'D%'

Conclusion: SQL Server does often use an index when a wildcard is present in the search predicate.

An index seek can be used when the wildcard is not at the start of the string, and there are enough literal characters before the wild card for the selectivity to warrant a seek.

If the wild card is at the start of the string then a scan must be used – SQL Server will choose to scan the index rather than the table because the index is narrower.

If the index covers the query then a RID (or clustered key) look-up won’t be required and SQL Server will choose an index seek for even the minimally selective predicate with a single literal character preceding the wildcard.


Comments

Leave a comment on the original post [mattsql.wordpress.com, opens in a new window]

Loading comments...