September 4, 2008 at 12:40 pm
It can 'short-circuit', but not on the basis of where the expressions sit in the where clause; rather, it depends on which indexes are used and which expressions those indexes can 'evaluate'.
Say we have a query with a where clause
WHERE B > 8 AND A = 16
and there's a nonclustered index on A that the optimiser chooses to use. Then B > 8 will only be evaluated on the rows that the index returns, i.e. the rows where A = 16.
You could call that a form of short-circuit. I prefer not to, as the phrase brings to mind what C# and other languages do, which is a position-based short-circuit.
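To illustrate what I mean, here's a minimal sketch (the table, index and values are hypothetical, purely for illustration):
CREATE TABLE dbo.ShortCircuitDemo (
A int,
B int,
Filler char(100)
);
CREATE NONCLUSTERED INDEX idx_ShortCircuitDemo_A ON dbo.ShortCircuitDemo (A);
-- If the optimiser picks the index on A, the seek returns only the rows where A = 16,
-- and B > 8 is then evaluated (as a lookup/residual predicate) on just those rows.
SELECT A, B FROM dbo.ShortCircuitDemo WHERE B > 8 AND A = 16;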
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
September 4, 2008 at 1:14 pm
I have written a brief description of this exact problem here:
I have lost count of the number of times I've seen this, and at first sight it looks like a good idea. But the fact is, it won't use an index, so an alternative like dynamic SQL has got to be considered.
I haven't tried the COALESCE solution. I can feel a test coming on 🙂
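For anyone following along, this is the sort of 'catch-all' pattern under discussion (table and column names are made up for illustration):
-- A 'catch-all' search: @CustomerID = 0 means 'return everything'.
-- Looks neat, but the OR typically forces a scan rather than an index seek.
CREATE PROCEDURE dbo.SearchOrders
@CustomerID int
AS
SELECT OrderID, CustomerID, OrderDate
FROM dbo.Orders
WHERE CustomerID = @CustomerID OR @CustomerID = 0;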
September 4, 2008 at 1:31 pm
Glen (9/4/2008)
Gail, would it be proper then to consider that, on a table with a search on an unindexed column,
"WHERE @param = something OR field = @param" and
"WHERE field = @param OR @param = something"
will return the same elapsed time and CPU time in the server execution times?
Indeed they will
I doubt it. I can run some tests, but I would assume that the second where clause should have higher CPU time and server execution time than the first.
Run on SQL 2008 RTM on a Vista machine, 4 procs, 4 GB memory
CREATE TABLE WhereclauseOrder (
somestring char(1)
)
-- Populate with a million random letters in the range A-H,
-- using a cross join of spt_values as a row source
INSERT INTO WhereclauseOrder (somestring)
SELECT TOP 1000000
CHAR(65 + FLOOR(RAND((s1.number + s2.number)*4823)*8))
FROM master..spt_values s1 CROSS JOIN master..spt_values s2
WHERE s1.[name] IS NULL AND s2.[name] IS NULL
set statistics time on
go
DECLARE @param char(1)
SET @param = 'F'
select * from WhereclauseOrder where somestring = @param or @param = 'Z'
select * from WhereclauseOrder where @param = 'Z' or somestring = @param
First query: CPU time = 531 ms, elapsed time = 1149 ms.
Second query: CPU time = 531 ms, elapsed time = 1643 ms.
I ran it a few times; there are small variations in the CPU time each time. Sometimes the first is 50 or so ms faster, sometimes the second is. If you look at the exec plans, both show a table scan with both of the conditions of the OR as predicates on the scan.
But that's not a proper test. So, this needs to be included as well (same table)
DECLARE @param char(1)
SET @param = 'Z'
select * from WhereclauseOrder where somestring = @param or @param = 'Z'
select * from WhereclauseOrder where @param = 'Z' or somestring = @param
Query 1: CPU time = 2109 ms, elapsed time = 8344 ms.
Query 2: CPU time = 2094 ms, elapsed time = 8214 ms.
Again, there were variations of 100 ms or so across the different executions. Again, the execution plans of the two are identical, and both show a table scan with both of the conditions of the OR as predicates on the scan.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
September 4, 2008 at 11:57 pm
I've come across something similar in SQL 2000 with LEFT JOINs before (a bug). I'm not sure which service pack resolves it, but if one of the columns contains a null value, it slows the query right down.
E.g. tbl1 t1 LEFT JOIN tbl2 t2 ON t1.col1 = t2.col1; if t2.col1 is null, CPU usage hits the roof.
To get round this you need:
tbl1 t1 LEFT JOIN tbl2 t2 ON t1.col1 = ISNULL(t2.col1, 0)
It may not be your issue, depending on your service pack, but it's worthwhile knowing.
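In full, the rewrite looks something like this (table and column names are made up; this just expands the fragment above):
-- Original join; the claim is that on unpatched SQL 2000 this could crawl when t2.col1 held NULLs
SELECT t1.col1, t2.col2
FROM tbl1 t1 LEFT JOIN tbl2 t2 ON t1.col1 = t2.col1;
-- Workaround: replace the NULLs on the join column with a sentinel
-- (pick a sentinel value, 0 here, that cannot occur in t1.col1, or rows will falsely match)
SELECT t1.col1, t2.col2
FROM tbl1 t1 LEFT JOIN tbl2 t2 ON t1.col1 = ISNULL(t2.col1, 0);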
Questions:
Has anything changed on your server/DB service pack since this change was noticed?
Any chance of seeing the SQL from the last time the sp ran successfully?
Do the table(s) the query is run against have indexing/a high degree of fragmentation?
On another note:
It may be an idea to move the result set into a table variable rather than a temp table, if a temp table is needed at all.
Put parentheses around the LEFT JOIN AND statements - this will help SQL Server 2000.
September 5, 2008 at 8:33 am
GilaMonster (9/4/2008)
If the data is not changing a lot, then you are probably not running into lock waits and hence nolock won't help much. Nolock is not a silver bullet to be added to every select statement without careful consideration.
So, if I'm running reports on a server that's refreshed once every 24 hrs, but multiple users are performing reads (no updates, inserts or deletes allowed in anything but tempdb), am I gaining anything at all in using NOLOCK? Our DBA recently suggested to the reporting analysts that everyone use NOLOCK every time. Sounds like he thinks it's a silver bullet.
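For reference, the blanket suggestion amounts to putting this hint on every read (a sketch against a hypothetical table):
-- NOLOCK (i.e. READ UNCOMMITTED) applied as a table hint:
SELECT OrderID, OrderTotal
FROM dbo.Orders WITH (NOLOCK);
-- On a server where nothing is being modified, readers aren't blocked by writers anyway,
-- so it's not obvious this buys anything.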
---------------------------------------------------------
How best to post your question
How to post performance problems
Tally Table: What it is and how it replaces a loop
"stewsterl 80804 (10/16/2009): I guess when you stop and try to understand the solution provided you not only learn, but save yourself some headaches when you need to make any slight changes."
September 5, 2008 at 9:54 am
Glen (9/5/2008)
And shouldn't the variation of "100 ms" in execution be random?
In the test I posted the variations were indeed completely random. I'll test later with a couple of params and again with a couple of columns.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
September 5, 2008 at 11:25 am
OK, with 3 columns, populated much the same way as the first. I interleaved the queries to make the comparison as even as possible.
set statistics time on
go
DECLARE @param char(1), @param1 char(1), @param2 char(1)
SET @param = 'A'
SET @param1 = 'B'
SET @param2 = 'Z'
Print 'Query 1'
select * from WhereclauseOrder
where (somestring = @param or @param = 'Z')
and (somestring1 = @param1 or @param1 = 'Z')
and (somestring2 = @param2 or @param2 = 'Z')
Print 'Query 2'
select * from WhereclauseOrder
where (@param = 'Z' or somestring = @param)
and (@param1 = 'Z' or somestring1 = @param1)
and (@param2 = 'Z' or somestring2 = @param2)
Go 5
Stats time output (with the unnecessary stuff removed)
Iteration 1
Query 1
SQL Server Execution Times:
CPU time = 1406 ms, elapsed time = 1458 ms.
Query 2
SQL Server Execution Times:
CPU time = 1406 ms, elapsed time = 1548 ms.
Iteration 2
Query 1
CPU time = 1375 ms, elapsed time = 1423 ms.
Query 2
CPU time = 1391 ms, elapsed time = 1513 ms.
Iteration 3
Query 1
SQL Server Execution Times:
CPU time = 1422 ms, elapsed time = 1448 ms.
Query 2
SQL Server Execution Times:
CPU time = 1390 ms, elapsed time = 1458 ms.
Iteration 4
Query 1
SQL Server Execution Times:
CPU time = 1391 ms, elapsed time = 1433 ms.
Query 2
SQL Server Execution Times:
CPU time = 1406 ms, elapsed time = 1473 ms.
Iteration 5
Query 1
SQL Server Execution Times:
CPU time = 1375 ms, elapsed time = 1406 ms.
Query 2
SQL Server Execution Times:
CPU time = 1438 ms, elapsed time = 1487 ms.
The thing is, the optimiser is free to pick how the query will be executed at the time it compiles the query, so you may sometimes get only the constant expressions evaluated, and other times you may not. There is no guarantee of the order in which the predicates will be evaluated. It can and will change depending on the data in the table, the predicates in the query and possibly other server conditions.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
September 5, 2008 at 12:53 pm
jomills,
I'm fairly new to the database game, so excuse my question if it is obvious. Why are you joining to ord_deal as a subquery instead of joining directly to the table? Since the subquery is (SELECT * FROM ORD_DEAL), wouldn't it be better to omit the subquery and join to the table?
September 5, 2008 at 1:49 pm
Jay_Noob (9/5/2008)
jomills, I'm fairly new to the database game, so excuse my question if it is obvious. Why are you joining to ord_deal as a subquery instead of joining directly to the table? Since the subquery is (SELECT * FROM ORD_DEAL), wouldn't it be better to omit the subquery and join to the table?
This is a piece of code that has been around since 2005, modified and updated by various DBAs. I suspect the original code had a limiting WHERE clause in it at some point in the past, and the person who updated it simply removed the condition rather than adding the table to the FROM clause directly. 🙂 I've not tried your suggestion so can't be certain, but I would suspect there could be some improvement.
September 10, 2008 at 3:28 am
Glen (9/4/2008)
I remember reading in one of the blogs that SQL Server does short-circuit evaluation, but it was not specified under what conditions.
I don't agree here, Gila, because SQL does do short-circuit evaluation. Try this code:
select 'x' where 1=0 and 1/0 = 0
September 10, 2008 at 5:07 am
I've just tried this on SQL Server 2005 Standard
select 'x' where 1=0 and 1/0=0
select 'x' where 1/0=0 and 1=0
select 'x' where 1=1 and 1/0=0
select 'x' where 1/0=0 and 1=1
The first two return empty result sets, the second two return "divide by zero" errors.
This proves that SQL Server does do short-circuit evaluation, but that the order is not important. The optimizer is cleverer than that.
September 10, 2008 at 5:13 am
Richard Fryar (9/10/2008)
I've just tried this on SQL Server 2005 Standard
select 'x' where 1=0 and 1/0=0
select 'x' where 1/0=0 and 1=0
select 'x' where 1=1 and 1/0=0
select 'x' where 1/0=0 and 1=1
The first two return empty result sets, the second two return "divide by zero" errors.
This proves that SQL Server does do short-circuit evaluation, but that the order is not important. The optimizer is cleverer than that.
On SQL Server 2k, the order is important:
select 'x' where sqrt(3)=3 and 1/0=0
select 'x' where 1/0=0 and sqrt(3)=3
The first query returns an empty result set, the second fails with a "divide by zero" error 🙂
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
September 10, 2008 at 5:36 pm
Getting back to the question of why the query performs differently in a procedure, an execution plan must be chosen when the procedure is compiled with no knowledge of the run-time parameter value. It's not "confused" by "WHERE col1 = @param1 OR @param1 = 0", it just has to decide on one plan based on limited information. It will probably assume a non-zero parameter value and choose to do an index scan, possibly with a bookmark lookup. When a zero argument is passed to the procedure a table scan (or a different index) would probably be much more efficient.
When the query is used in a batch, the parameter values are known when the statement is compiled and the most efficient execution plan may be chosen.
The "WHERE column = COALESCE(@param, column)" version (with a NULL value for @param instead of zero) often works better because the use of a function will probably make the query optimizer overlook this clause when choosing indexes. It doesn't give you the best plan in all cases, but it should avoid the worst-case situations in the original post.
With a small number of optional parameters, you can create a procedure with every variation of the query using "WHERE col1 = @param1", "WHERE col2 = @param2", "WHERE col1 = @param1 AND col2 = @param2", etc. and use "IF @param1 = 0", "IF @param2 = 0" statements to execute the correct one. This quickly leads to a maintenance nightmare if more parameters are added, but it will guarantee that the optimum plan will be used for any combination of arguments.
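A sketch of that branching approach with two optional parameters (names are made up; each branch gets its own simple, optimisable query):
CREATE PROCEDURE dbo.SearchOrders_Branched
@CustomerID int = 0,
@ProductID int = 0
AS
IF @CustomerID = 0 AND @ProductID = 0
SELECT OrderID FROM dbo.Orders;
ELSE IF @ProductID = 0
SELECT OrderID FROM dbo.Orders WHERE CustomerID = @CustomerID;
ELSE IF @CustomerID = 0
SELECT OrderID FROM dbo.Orders WHERE ProductID = @ProductID;
ELSE
SELECT OrderID FROM dbo.Orders WHERE CustomerID = @CustomerID AND ProductID = @ProductID;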
Dynamic SQL is often the best solution, as it can handle more parameters and the cost of compiling the statement is less than the cost of waiting forever for a bad execution plan to execute.
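A minimal parameterised dynamic SQL sketch (again with made-up names), using sp_executesql so the inputs stay parameterised and plans can be reused per WHERE-clause shape:
DECLARE @CustomerID int, @ProductID int, @sql nvarchar(max);
SET @CustomerID = 16;
SET @ProductID = 0;
-- Build only the predicates that were actually supplied
SET @sql = N'SELECT OrderID, CustomerID, OrderDate FROM dbo.Orders WHERE 1 = 1';
IF @CustomerID <> 0
SET @sql = @sql + N' AND CustomerID = @CustomerID';
IF @ProductID <> 0
SET @sql = @sql + N' AND ProductID = @ProductID';
EXEC sp_executesql @sql,
N'@CustomerID int, @ProductID int',
@CustomerID = @CustomerID, @ProductID = @ProductID;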
September 11, 2008 at 8:52 am
Thanks Scott,
I had to come up with a temporary work around while I worked on the problem of this Stored proc, I will incorporate some of your suggestions in the orig proc and see how it turns out.
-Joseph
September 11, 2008 at 11:43 am
Since you're using 2005, a relatively low-cost fix (instead of dynamic SQL) might simply be to force that select statement to recompile every single time, meaning: add OPTION (RECOMPILE) to the select statement. This will force a statement-level recompile (which tends to be really fast), giving you the best exec plan on each run.
Interestingly enough, if you look up the Query Hints entry in BOL, the example shown looks to be almost a clone of your example with the OPTION (RECOMPILE) hint.
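Something along these lines (the query shape mirrors the catch-all pattern discussed earlier in the thread; names are hypothetical):
DECLARE @CustomerID int;
SET @CustomerID = 0;
SELECT OrderID, CustomerID, OrderDate
FROM dbo.Orders
WHERE CustomerID = @CustomerID OR @CustomerID = 0
OPTION (RECOMPILE);
-- The statement is recompiled on every execution, so the optimiser sees the actual
-- parameter values and can pick an appropriate plan for each call.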
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?