Querying an indexed table

  • If all the columns referenced in the where clause of a query are part of an index, does the sequence of the columns affect the performance of the query?

    For example, suppose a table has an index on FieldA, FieldB, and FieldC (in that order).

    A query's where clause is written as:

    Where FieldC = and

    FieldB = and

    FieldA =

    Would reworking this query as:

    Where FieldA = and

    FieldB = and

    FieldC =

    be expected to improve performance?

    Thanks

  • Hi Dan,

    It doesn't make a difference which order the where clause is written in. So as long as it's logically the same you're ok.

  • Thanks

  • Note that it is VERY important which order the columns are placed in the index however. Assuming all 3 columns are referenced in all queries, you should put the columns in left-to-right order of decreasing specificity for optimal performance.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Thanks.

  • Column order in predicates is only relevant if SQL Server has no statistics available on the referenced columns.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • rbarryyoung (3/19/2008)


    Column order in predicates is only relevant if SQL Server has no statistics available on the referenced columns.

    Could you explain more please.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • It probably doesn't matter as long as the default setting of Autocreate statistics is adhered to and left ON. Then the optimizer will likely gather statistics and come up with best plan anyway.

    If it doesn't do this, IIRC the optimizer estimates 30% of table is returned and picks query plan based off of this number. This could easily be misremembered tho.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Gail, my understanding is that, in the abscence of statistics, the optimizer will, having nothing else to go on, order the predicates as they were ordered in the query statement.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • rbarryyoung (3/20/2008)


    Gail, my understanding is that, in the abscence of statistics, the optimizer will, having nothing else to go on, order the predicates as they were ordered in the query statement.

    Well if there are no stats at all and no indexes, SQL will just do a table scan and evaluate all of the predicates at the same time.

    If you've managed to get indexes without stats, then SQL will evaluate the predicates based on the available index if it's guess as to affected rows (30% of the total rows in the table) shows the index cheaper than a table scan. If not, then it will resort to a table scan as above.

    OK, some code to test things out. Run this on a test server. I have a database called Testing that I use for this kinda thing.

    ALTER DATABASE Testing

    SET AUTO_CREATE_STATISTICS off, AUTO_UPDATE_STATISTICS Off

    GO

    Use Testing

    Go

    Create TABLE TestingPredicates (

    id INT,

    StatusCode CHAR(1),

    SomeOddDate Datetime

    )

    GO

    ;WITH Populate (ID, Code, SomeDate) AS (

    SELECT 1 AS ID, CHAR(65) AS Code, CAST('1980/01/01' AS DATETIME) AS SomeDate

    UNION ALL

    SELECT ID + 1, CHAR(65+FLOOR(RAND(ID*5122)*15)), DATEADD(dd,FLOOR(RAND(ID*84215)*5),SomeDate) FROM Populate

    WHERE ID<10000

    )

    INSERT INTO TestingPredicates (ID, StatusCode, SomeOddDate)

    SELECT * FROM Populate

    OPTION (MAXRECURSION 10000)

    First test. No indexes, no stats

    SELECT * FROM TestingPredicates

    WHERE someOddDate BETWEEN '2008/01/01' AND '2008/03/31' AND StatusCode = 'F'

    Execution plan - Table scan with Predicate.

    [Testing].[dbo].[TestingPredicates].[SomeOddDate]>=CONVERT_IMPLICIT(datetime,[@1],0)

    AND [Testing].[dbo].[TestingPredicates].[SomeOddDate]<=CONVERT_IMPLICIT(datetime,[@2],0)

    AND [Testing].[dbo].[TestingPredicates].[StatusCode]=[@3]

    2nd Test. Query other way round.

    SELECT * FROM TestingPredicates

    WHERE StatusCode = 'F' AND someOddDate BETWEEN '2008/01/01' AND '2008/03/31'

    Execution plan - Table scan with Predicate. Same predicate, just different names for the automatically created parameters

    [Testing].[dbo].[TestingPredicates].[SomeOddDate]>=CONVERT_IMPLICIT(datetime,[@2],0)

    AND [Testing].[dbo].[TestingPredicates].[SomeOddDate]<=CONVERT_IMPLICIT(datetime,[@3],0)

    AND [Testing].[dbo].[TestingPredicates].[StatusCode]=[@1]

    Now, an index on those 2 columns...

    CREATE INDEX idx_TestingStuff ON TestingPredicates (StatusCode, someOddDate)

    Now I have an index that appears to have no statistics. Interesting....

    First a test with Select *

    SELECT * FROM TestingPredicates

    WHERE someOddDate BETWEEN '2008/01/01' AND '2008/03/31' AND StatusCode = 'F'

    Table scan with predicate. Apparently the potential cost of looking up a column from the heap for 30% of the table is too expensive. (Estimated rows 60, actual rows 2)

    Same form for the predicate as before, except it's not parameterised this time.

    [Testing].[dbo].[TestingPredicates].[SomeOddDate]>='2008-01-01 00:00:00.000'

    AND [Testing].[dbo].[TestingPredicates].[SomeOddDate]<='2008-03-31 00:00:00.000'

    AND [Testing].[dbo].[TestingPredicates].[StatusCode]='F'

    And last test. Remove the * so that the index is covering...

    SELECT someOddDate, StatusCode FROM TestingPredicates

    WHERE someOddDate BETWEEN '2008/01/01' AND '2008/03/31' AND StatusCode = 'F'

    And now we have an index seek, though the estimation is still off. Seek predicates (as expected) in the order of columns in the index.

    Prefix: [Testing].[dbo].[TestingPredicates].StatusCode = Scalar Operator([@3]),

    Start Range: [Testing].[dbo].[TestingPredicates].SomeOddDate >= Scalar Operator(CONVERT_IMPLICIT(datetime,[@1],0)),

    End Range: [Testing].[dbo].[TestingPredicates].SomeOddDate <= Scalar Operator(CONVERT_IMPLICIT(datetime,[@2],0))

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 10 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply