Case Sensitivity in Selects - Part 1

If you don't have to deal with case sensitivity, count yourself lucky! SQL

2000 is much better than SQL 7 with regard for collations, supporting collations

at database, table, and column level. In this article I'm going to delve into a

interesting (or obvious?) item I noticed recently when I did some performance

tuning.

Basically we were doing a join to a table where we had to treat one column as

case sensitive. The table in question had about 130k rows and had an index (not

unique) on the column we were joining to. Not a huge table, a join to it should

be reasonably fast, yet I saw a high number of reads. Digging into the code -

which I sadly admit to writing - I found that we were handling the issue of case

sensitivity by converting the column to varbinary, a technique that works in

SQL7 or SQL 2000. An example written using a different table and simplified

syntax is shown here:

select * from instantforum_members where convert(varbinary(50), fullname) = 'andy warren'

If you're using SQL2K, you can accomplish the same thing or even better

depending on your needs by using a collation instead, like this:

select * from instantforum_members where fullname='andy warren' collate Latin1_General_CS_AI

Before we go on, let's start with what a simpler non case sensitive query

would look like and do:

select * from instantforum_members where fullname='andy warren'

You can see that just doing a simple select for my name results in a plan

that starts with an index seek, which is good, and has a cost of .003283. The

cost is relative, but in practical terms this is a very fast efficient query.

Now let's try the method I was using to handle sensitivity:

Here you can see the percentages shift as it does an index scan rather than

an index seek, no doubt caused by the convert statement which requires it to

touch every row. What's far more interesting is that the query cost jumps to

.508702, a huge jump! I ran the statement that used collate, not really

expecting a difference but you have to try, resulting in the exact same query

plan. That seems to make sense, it has to do the same work. Being curious, I

tried one more query specifying the collation as case insensitive and wound up

with the same results. Double checking the database found that I wasn't

specifying the correct & current collation which was

SQL_Latin1_General_CP1_CI_AS. Running one more time resulted in the same results

that I got in the first query with no collation specified.

Sorry if that is confusing. The net is that if you specify any collation

except the one in use on that column or you use a convert you're going to force

SQL to check each row, most likely resulting in either an index scan or a table

scan.

That got me thinking about how I could improve the performance without

changing the collation (an article for another day!). My thought process was

that the query itself was quick, the case sensitive compare slow only because it

had to look at every row, how to limit the query first and then do the compare.

For once it turned out to be exactly that easy, resulting in the following:

select * from instantforum_members where fullname = 'andy warren' and convert(varbinary(50), fullname) = 'andy warren'

Just that simple change gave me the results I needed at basically zero cost.

I've only tested on two tables so I can't say with certainty that the results I

show here are always true, but so far it seems to be consistent. It also shows

how even the simplest change can have an impact on query efficiency. In a follow

up article I'll look into others options including making the column case

sensitive.

Introduction to Parameterization in SQL Server

In a previous article I showed how to look at what query plans are in SQL Server's memory. In this article I'll look at how they got there and how and when they get reused. This is commonly called compiling a query plan. More specifically I'll look at how and when SQL Server can parameterize dynamic SQL

Using Bitwise Operators to Boost Performance

Bitwise operators can be challenging to manage at first. However, with practice and patience, and under the right conditions, these operators can provide remarkable performance improvements in production environments. This article will compare two methods of accomplishing the same output, one with a normalized model and the other with bitwise operators.

Understanding Execution Plans Part 1

The purpose of this article is to give you a working knowledge of how to view and understand query execution plans for SQL Server. This is part 1 in a series of articles that will walk you through understanding execution plans to help you improve your queries.

Making Dynamic Queries Static

Building and executing dynamic sql in a stored procedure - is it the only way to solve problems like supporting a simple search function? Leon offers a couple alternatives that let you continue to provide the functionality in a stored procedure without using dynamic sql. Interesting ideas worth exploring!

Case Sensitivity in Selects - Part 1

Rate

Share

Categories

Share

Rate

Case Sensitivity in Selects - Part 1

Rate

Share

Categories

Share

Rate

Related content

Introduction to Parameterization in SQL Server

Using Bitwise Operators to Boost Performance

Understanding Execution Plans Part 1

How Dynamic SQL Can Be Static SQL Part 2

Making Dynamic Queries Static

Cookies on SQLServerCentral