Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase ««123»»

LIKE with and without wildcards in WHERE clause Expand / Collapse
Author
Message
Posted Monday, November 26, 2012 11:52 AM


SSCoach

SSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoach

Group: General Forum Members
Last Login: Friday, June 27, 2014 12:43 PM
Points: 15,444, Visits: 9,596
The T-SQL query optimizer doesn't really shortcut the way you're thinking.

I took a relatively large table on one of my test servers, added an index to an nvarchar(75) column, and tried both query constructs. The Select statement only covered the clustered index key, so no lookups were needed.

One query had a Where clause like this:

WHERE MyColumn LIKE @Input1 + '%'
AND MyColumn2 LIKE @Input2 + '%'

The other had:

WHERE (@Input1 = '' OR MyColumn LIKE @Input1 + '%')
AND (@Input2 = '' OR MyColumn2 LIKE @Input2 + '%')

The first version, without the OR statements, gets an Index Seek in the Actual Execution Plan, the second gets an Index Scan. Both use the index I created to test this.

I added a column to the query that wasn't included in the index, and the first one got an Index Seek + a Key Lookup, while the second remained as a full Index Scan (of course).

Not that it means very much, but the one with the seek and the lookup was estimated at 17% of the total work, while the scan was estimated at 83%.

On the data volume I'm dealing with (a few thousand rows in this case), the actual compile+execution time for both was comparable, with the Index Seek being consistently slightly faster. Like 15 milliseconds vs 16 milliseconds. Larger data volume and more complex searches (like '%' + @Input + '%', to find strings inside a column, instead of just on the leading edge), the speed advantage will be more pronounced.


- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
Post #1388758
Posted Monday, November 26, 2012 3:51 PM
SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Today @ 3:54 PM
Points: 1,970, Visits: 2,910
pdanes (11/26/2012)
DiverKas (11/26/2012)
But wouldn't this always compare all variables, though? In the syntax with OR, the query engine can see that a variable is empty, thereby satisfying one side of the OR condition, and need not even look in the database.


I am not sure that this is true. This isn't .NET where code branches based on one side of an IF or SWITCH. From what I have seen, the query engine runs both sides and then compares.

But thats just me.

I don't know either. In VBA, such a comparison does get all parameters evaluated before the comparison is performed. In the Access Jet engine, it does not - a SWITCH function, for instance, stops evaluating when the first true condition is reached, so you can put in computations that yield a Null or unknown value, if something else stops the eval first, which is not possible in VBA.

But the SQL Server optimizer is so smart about avoiding unnecessary work, it seems to me that in a situation when an already known value of a local variable can -completely- obviate the need to examine table data, it would most certainly do so. Maybe examining a few query plans would tell the story, but I'm still not very good at reading those, despite many hours spent over tutorials on the subject.



SQL can indeed do short-circuiting, and sometimes it will. It might also test things in a different order than you've specified. For example, you write "A = B OR C = D", SQL might test "C = D" first, and then skip "A = B", you really can't be sure which, if either, will happen.

A CASE statement, however, is guaranteed to work in order. So you might try coding it like this:

WHERE 
(1 = CASE
WHEN @input_variable = '' THEN 1
WHEN column_name LIKE @input_variable THEN 1
ELSE 0 END) AND
(1 = CASE ... END) AND ...


No guarantees, but it might be worth trying.


SQL DBA,SQL Server MVP('07, '08, '09)
"In America, every man is innocent until proven broke!" Brant Parker
Post #1388880
Posted Tuesday, November 27, 2012 6:32 AM


SSCoach

SSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoach

Group: General Forum Members
Last Login: Friday, June 27, 2014 12:43 PM
Points: 15,444, Visits: 9,596
ScottPletcher (11/26/2012)
pdanes (11/26/2012)
DiverKas (11/26/2012)
But wouldn't this always compare all variables, though? In the syntax with OR, the query engine can see that a variable is empty, thereby satisfying one side of the OR condition, and need not even look in the database.


I am not sure that this is true. This isn't .NET where code branches based on one side of an IF or SWITCH. From what I have seen, the query engine runs both sides and then compares.

But thats just me.

I don't know either. In VBA, such a comparison does get all parameters evaluated before the comparison is performed. In the Access Jet engine, it does not - a SWITCH function, for instance, stops evaluating when the first true condition is reached, so you can put in computations that yield a Null or unknown value, if something else stops the eval first, which is not possible in VBA.

But the SQL Server optimizer is so smart about avoiding unnecessary work, it seems to me that in a situation when an already known value of a local variable can -completely- obviate the need to examine table data, it would most certainly do so. Maybe examining a few query plans would tell the story, but I'm still not very good at reading those, despite many hours spent over tutorials on the subject.



SQL can indeed do short-circuiting, and sometimes it will. It might also test things in a different order than you've specified. For example, you write "A = B OR C = D", SQL might test "C = D" first, and then skip "A = B", you really can't be sure which, if either, will happen.

A CASE statement, however, is guaranteed to work in order. So you might try coding it like this:

WHERE 
(1 = CASE
WHEN @input_variable = '' THEN 1
WHEN column_name LIKE @input_variable THEN 1
ELSE 0 END) AND
(1 = CASE ... END) AND ...


No guarantees, but it might be worth trying.


CASE statement like that in the Where clause guarantees index/table scans.


- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
Post #1389150
Posted Tuesday, November 27, 2012 6:49 AM
SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Monday, July 14, 2014 9:47 AM
Points: 2,919, Visits: 2,525
Have you considered a full text index? I have had very good luck in some cases with issues such as yours. Check it out here and see if it would work for your case.

http://msdn.microsoft.com/en-us/library/ms345119(v=sql.90).aspx


It will not always fit your needs, but may be an option to explore.



Steve Jimmo
Sr DBA
“If we ever forget that we are One Nation Under God, then we will be a Nation gone under." - Ronald Reagan
Post #1389162
Posted Tuesday, November 27, 2012 9:27 AM


Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Wednesday, July 23, 2014 2:54 PM
Points: 326, Visits: 926
sjimmo (11/27/2012)
Have you considered a full text index? I have had very good luck in some cases with issues such as yours. Check it out here and see if it would work for your case.

http://msdn.microsoft.com/en-us/library/ms345119(v=sql.90).aspx


It will not always fit your needs, but may be an option to explore.

I have never used one of those, but this query can adress derived fields in up to five joined views. I'll take a look at it, but that doesn't seem like the best candidate to start on something I know nothing about. Thanks for the tip, though.
Post #1389281
Posted Tuesday, November 27, 2012 9:34 AM


Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Wednesday, July 23, 2014 2:54 PM
Points: 326, Visits: 926
There doesn't seem to be any clear consensus on all this, so I guess I'll have to build a bunch of test cases and see what comes of it. Still seems to me that the SS engine ought to know that testing a single parameter value is cheaper than looking through a table, and should try that first, but maybe I'm expecting too much of it (for once). Thank you everyone for the input.
Post #1389286
Posted Tuesday, November 27, 2012 11:13 AM


SSCoach

SSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoach

Group: General Forum Members
Last Login: Friday, June 27, 2014 12:43 PM
Points: 15,444, Visits: 9,596
pdanes (11/27/2012)
There doesn't seem to be any clear consensus on all this, so I guess I'll have to build a bunch of test cases and see what comes of it. Still seems to me that the SS engine ought to know that testing a single parameter value is cheaper than looking through a table, and should try that first, but maybe I'm expecting too much of it (for once). Thank you everyone for the input.


You're thinking procedurally. SQL Server has to generate an execution plan that can take any value or none for any given parameter. It's not a runtime environment, like .NET or JAVA, it's a "declarative language". So when you say, "The variable can be this or can be that", it has to build a single plan that handles both of those correctly. To the optimizer, "correct" is more important than "efficient", so if you give it a query that can have wildly different "most efficient means to execute", you end up with a plan that has to cover all eventualities, and thus will be less efficient.

There are ways to work around that. Various flavors of parameterized dynamic SQL, nested procs, With Recompile, etc., are all popular and effective methods of essentially generating a different query for different parameter combinations.

I tend to use dynamic SQL for that kind of thing.

DECLARE @SQL NVARCHAR(max) = '
SELECT *
FROM dbo.MyTable
WHERE 1 = 1';

IF @Parameter1 IS NOT NULL
SET @SQL += '
AND MyColumn = @Parameter1';

EXEC sp_executeSQL @SQL, '@Parameter1 INT', @Parameter1;

That kind of thing. Don't actually build the values into the string, pass them in as parameters.

Then, if a parameter value is null or blank, it simply doesn't get added to the Where clause, and you end up with a highly optimizable query, containing only those criteria that actually matter at runtime. No over-broad execution plans because of CASE or OR in Where.


- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
Post #1389327
Posted Tuesday, November 27, 2012 4:09 PM


Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Wednesday, July 23, 2014 2:54 PM
Points: 326, Visits: 926
GSquared (11/27/2012)
pdanes (11/27/2012)
There doesn't seem to be any clear consensus on all this, so I guess I'll have to build a bunch of test cases and see what comes of it. Still seems to me that the SS engine ought to know that testing a single parameter value is cheaper than looking through a table, and should try that first, but maybe I'm expecting too much of it (for once). Thank you everyone for the input.

I tend to use dynamic SQL for that kind of thing.

I've been a little paranoid about dynamic SQL since reading all the stories about injection attacks, but this method of building parameters looks safe enough. I'll try that as well - thanks for the sample code.
Post #1389454
Posted Tuesday, November 27, 2012 4:46 PM


SSC Eights!

SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!

Group: General Forum Members
Last Login: Sunday, July 20, 2014 7:16 AM
Points: 945, Visits: 1,769
pdanes (11/27/2012)
GSquared (11/27/2012)
pdanes (11/27/2012)
There doesn't seem to be any clear consensus on all this, so I guess I'll have to build a bunch of test cases and see what comes of it. Still seems to me that the SS engine ought to know that testing a single parameter value is cheaper than looking through a table, and should try that first, but maybe I'm expecting too much of it (for once). Thank you everyone for the input.

I tend to use dynamic SQL for that kind of thing.

I've been a little paranoid about dynamic SQL since reading all the stories about injection attacks, but this method of building parameters looks safe enough. I'll try that as well - thanks for the sample code.


i found http://www.sommarskog.se/dynamic_sql.html to be very helpful to my understanding of dynamic SQL. After reading that article several times i am confident i can write injection free code.



For faster help in answering any problems Please read How to post data/code on a forum to get the best help - Jeff Moden for the best way to ask your question.

For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw

Need to Split some strings? Jeff Moden's DelimitedSplit8K
Jeff Moden's Cross tab and Pivots Part 1
Jeff Moden's Cross tab and Pivots Part 2

Jeremy Oursler
Post #1389468
Posted Wednesday, November 28, 2012 6:36 AM


SSCoach

SSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoach

Group: General Forum Members
Last Login: Friday, June 27, 2014 12:43 PM
Points: 15,444, Visits: 9,596
pdanes (11/27/2012)
GSquared (11/27/2012)
pdanes (11/27/2012)
There doesn't seem to be any clear consensus on all this, so I guess I'll have to build a bunch of test cases and see what comes of it. Still seems to me that the SS engine ought to know that testing a single parameter value is cheaper than looking through a table, and should try that first, but maybe I'm expecting too much of it (for once). Thank you everyone for the input.

I tend to use dynamic SQL for that kind of thing.

I've been a little paranoid about dynamic SQL since reading all the stories about injection attacks, but this method of building parameters looks safe enough. I'll try that as well - thanks for the sample code.


Injection-safety is a large part of why MS added sp_executeSQL to SQL Server. Much safer than Exec(), since strings are passed in as values instead of executed directly.

No way to inject with this method.


- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
Post #1389711
« Prev Topic | Next Topic »

Add to briefcase ««123»»

Permissions Expand / Collapse