March 18, 2008 at 7:00 am
If all the columns referenced in the where clause of a query are part of an index, does the sequence of the columns affect the performance of the query?
For example, suppose a table has an index on FieldA, FieldB, and FieldC (in that order).
A query's where clause is written as:
Where FieldC = and
FieldB = and
FieldA =
Would reworking this query as:
Where FieldA = and
FieldB = and
FieldC =
be expected to improve performance?
Thanks
March 18, 2008 at 7:43 am
Hi Dan,
It doesn't make a difference which order the where clause is written in. So as long as it's logically the same you're ok.
March 18, 2008 at 7:57 am
Thanks
March 19, 2008 at 3:50 pm
Note that it is VERY important which order the columns are placed in the index however. Assuming all 3 columns are referenced in all queries, you should put the columns in left-to-right order of decreasing specificity for optimal performance.
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
March 19, 2008 at 4:01 pm
Thanks.
March 19, 2008 at 4:39 pm
Column order in predicates is only relevant if SQL Server has no statistics available on the referenced columns.
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
March 20, 2008 at 12:18 am
rbarryyoung (3/19/2008)
Column order in predicates is only relevant if SQL Server has no statistics available on the referenced columns.
Could you explain more please.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
March 20, 2008 at 8:07 am
It probably doesn't matter as long as the default setting of Autocreate statistics is adhered to and left ON. Then the optimizer will likely gather statistics and come up with best plan anyway.
If it doesn't do this, IIRC the optimizer estimates 30% of table is returned and picks query plan based off of this number. This could easily be misremembered tho.
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
March 20, 2008 at 10:38 am
Gail, my understanding is that, in the abscence of statistics, the optimizer will, having nothing else to go on, order the predicates as they were ordered in the query statement.
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
March 25, 2008 at 12:45 am
rbarryyoung (3/20/2008)
Gail, my understanding is that, in the abscence of statistics, the optimizer will, having nothing else to go on, order the predicates as they were ordered in the query statement.
Well if there are no stats at all and no indexes, SQL will just do a table scan and evaluate all of the predicates at the same time.
If you've managed to get indexes without stats, then SQL will evaluate the predicates based on the available index if it's guess as to affected rows (30% of the total rows in the table) shows the index cheaper than a table scan. If not, then it will resort to a table scan as above.
OK, some code to test things out. Run this on a test server. I have a database called Testing that I use for this kinda thing.
ALTER DATABASE Testing
SET AUTO_CREATE_STATISTICS off, AUTO_UPDATE_STATISTICS Off
GO
Use Testing
Go
Create TABLE TestingPredicates (
id INT,
StatusCode CHAR(1),
SomeOddDate Datetime
)
GO
;WITH Populate (ID, Code, SomeDate) AS (
SELECT 1 AS ID, CHAR(65) AS Code, CAST('1980/01/01' AS DATETIME) AS SomeDate
UNION ALL
SELECT ID + 1, CHAR(65+FLOOR(RAND(ID*5122)*15)), DATEADD(dd,FLOOR(RAND(ID*84215)*5),SomeDate) FROM Populate
WHERE ID<10000
)
INSERT INTO TestingPredicates (ID, StatusCode, SomeOddDate)
SELECT * FROM Populate
OPTION (MAXRECURSION 10000)
First test. No indexes, no stats
SELECT * FROM TestingPredicates
WHERE someOddDate BETWEEN '2008/01/01' AND '2008/03/31' AND StatusCode = 'F'
Execution plan - Table scan with Predicate.
[Testing].[dbo].[TestingPredicates].[SomeOddDate]>=CONVERT_IMPLICIT(datetime,[@1],0)
AND [Testing].[dbo].[TestingPredicates].[SomeOddDate]<=CONVERT_IMPLICIT(datetime,[@2],0)
AND [Testing].[dbo].[TestingPredicates].[StatusCode]=[@3]
2nd Test. Query other way round.
SELECT * FROM TestingPredicates
WHERE StatusCode = 'F' AND someOddDate BETWEEN '2008/01/01' AND '2008/03/31'
Execution plan - Table scan with Predicate. Same predicate, just different names for the automatically created parameters
[Testing].[dbo].[TestingPredicates].[SomeOddDate]>=CONVERT_IMPLICIT(datetime,[@2],0)
AND [Testing].[dbo].[TestingPredicates].[SomeOddDate]<=CONVERT_IMPLICIT(datetime,[@3],0)
AND [Testing].[dbo].[TestingPredicates].[StatusCode]=[@1]
Now, an index on those 2 columns...
CREATE INDEX idx_TestingStuff ON TestingPredicates (StatusCode, someOddDate)
Now I have an index that appears to have no statistics. Interesting....
First a test with Select *
SELECT * FROM TestingPredicates
WHERE someOddDate BETWEEN '2008/01/01' AND '2008/03/31' AND StatusCode = 'F'
Table scan with predicate. Apparently the potential cost of looking up a column from the heap for 30% of the table is too expensive. (Estimated rows 60, actual rows 2)
Same form for the predicate as before, except it's not parameterised this time.
[Testing].[dbo].[TestingPredicates].[SomeOddDate]>='2008-01-01 00:00:00.000'
AND [Testing].[dbo].[TestingPredicates].[SomeOddDate]<='2008-03-31 00:00:00.000'
AND [Testing].[dbo].[TestingPredicates].[StatusCode]='F'
And last test. Remove the * so that the index is covering...
SELECT someOddDate, StatusCode FROM TestingPredicates
WHERE someOddDate BETWEEN '2008/01/01' AND '2008/03/31' AND StatusCode = 'F'
And now we have an index seek, though the estimation is still off. Seek predicates (as expected) in the order of columns in the index.
Prefix: [Testing].[dbo].[TestingPredicates].StatusCode = Scalar Operator([@3]),
Start Range: [Testing].[dbo].[TestingPredicates].SomeOddDate >= Scalar Operator(CONVERT_IMPLICIT(datetime,[@1],0)),
End Range: [Testing].[dbo].[TestingPredicates].SomeOddDate <= Scalar Operator(CONVERT_IMPLICIT(datetime,[@2],0))
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
Viewing 10 posts - 1 through 10 (of 10 total)
You must be logged in to reply to this topic. Login to reply