Index Usage Question

  • Hi

    Something interesting, to me anyway, that I've noticed about index usage that I would like to know more about. When I do an initial SELECT, it scans the clustered primary key. When I create both of the new indices and do either of the SELECTs, the db engine uses the index that I created last. Why?

    Code to reproduce:

    CREATE TABLE dbo.Tmp_bob

    (

    pkColA varchar(15) NOT NULL,

    otherCol float(53) NULL,

    pkColB varchar(3) NOT NULL,

    otherCols datetime NULL

    )

    GO

    ALTER TABLE dbo.Tmp_bob ADD CONSTRAINT

    PK_Tmp_bob PRIMARY KEY CLUSTERED

    (

    pkColA,

    pkColB

    ) WITH( STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    GO

    SELECT pkColA FROM Tmp_bob

    -- It scans the clustered primay key.

    CREATE NONCLUSTERED INDEX IX_Tmp_bob_pkColA ON Tmp_bob(pkColA) WITH PAD_INDEX, FILLFACTOR = 80

    CREATE NONCLUSTERED INDEX IX_Tmp_bob_pkColB ON Tmp_bob(pkColB) WITH PAD_INDEX, FILLFACTOR = 80

    SELECT pkColA FROM Tmp_bob

    -- It scans IX_Tmp_bob_pkColB

    DROP INDEX Tmp_bob.IX_Tmp_bob_pkColA

    DROP INDEX Tmp_bob.IX_Tmp_bob_pkColB

    CREATE NONCLUSTERED INDEX IX_Tmp_bob_pkColB ON Tmp_bob(pkColB) WITH PAD_INDEX, FILLFACTOR = 80

    CREATE NONCLUSTERED INDEX IX_Tmp_bob_pkColA ON Tmp_bob(pkColA) WITH PAD_INDEX, FILLFACTOR = 80

    SELECT pkColA FROM Tmp_bob

    -- It scans IX_Tmp_bob_pkColA

    Thanks.

  • Because the last nonclustered index that you added is covering (contains all the columns that the query needs) and is smaller than the cluster (because the cluster is the table). Hence scanning that NC index is faster and requires less IOs than scanning the cluster.

    In fact, both of the nonclustered indexes are covering, since a nonclustered index includes the clustering key. The two indexes are, for that query, equivalent and hence SQL can use either.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Yes you are right .A bit strange (to us that why the index that is created second is used)..but there might be some logic we dont know .

    main thing is , will it perform the same way once some good amount of data is in inside the table .

    I have not done any stuff to populate this table .

    Might try , but very occupied as of now .

    Regards

    Abhay Chaudhary
    Sr.DBA (MCITP/MCTS :SQL Server 2005/2008 ,OCP 9i)

  • It's also worth noting that if there is a choice of covering indexes that can be used to return the results, the one occupying the smallest number of pages is chosen.

    You can test this on any table with a clustered primary key and a few other indexes of varying sizes thus:

    SELECT

    ColumnAtTopOfPrimaryKey

    FROM

    TheTable

    I wouldn't recommend testing behaviour on empty tables either, because I think it can produce misleading results.

    Tim

    .

  • Tim Walker (6/25/2009)


    I wouldn't recommend testing behaviour on empty tables either, because I think it can produce misleading results.

    Indeed. Reading 0 pages is pretty quick no matter what index.

    Load the table up with a million or so rows and, provided that the length of the strings in those varchars are close to the declared value, I suspect that the index with B as the key col will most likely be favoured. It's close and the difference is unlikely to appear on small-mid sized tables

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • I just tested with both ten thousand and a million rows. The results are different now - only IX_Tmp_bob_pkColB is used, independent of index creation order. If I DROP IX_Tmp_bob_pkColB then IX_Tmp_bob_pkColA is used.

    Why would it use IX_Tmp_bob_pkColB if I haven't included any reference/data/information about pkColA in it? ("RTFM on covering indices!"?)

  • You have included it - you just didn't realise!

    Where you have a unique clustered index or primary key an index uses this to point to the data. Your example makes it a bit tricky to see this because you are using your columns for both the PK and the example indexes.

    A clearer example:

    Table contains ColA, ColB, ColC, ColD, ColE

    Clustered PK (or unique clustered index) on ColA, ColB

    Secondary Index on ColC

    What the secondary index actually holds is the relationship between ColC and ColA, ColB.

    So it is a covering index!

    Tim

    .

  • Additionally I am guessing it uses the index based on the varchar(3) column because it occupies less pages than the varchar(15) one.

    🙂

    Tim

    .

  • Thanks for the index relationship info Tim. I also thought that it would prefer to lookup on a smaller index.

  • Tim Walker (6/25/2009)


    Additionally I am guessing it uses the index based on the varchar(3) column because it occupies less pages than the varchar(15) one.

    Depending, of course, on the size of the string stored. If all the strings in the varchar(3) are exactly 3 chars long and all the strings in the varchar(15) are either 2 or 3 characters long, then the index on the varchar(15) will be smaller.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GDI Lord (6/25/2009)


    IWhy would it use IX_Tmp_bob_pkColB if I haven't included any reference/data/information about pkColA in it? ("RTFM on covering indices!"?)

    As I mentioned earlier, the clustering key is always part of all nonclustered indexes, whether you specify them of not. It has to be, it's the 'address' of the row. Hence both pk_ColA and pk_ColB will be part of every single nonclustered index.

    That, incidentally, is why one of the recommendations for a clustered index key is 'narrow'.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (6/25/2009)


    Tim Walker (6/25/2009)


    Additionally I am guessing it uses the index based on the varchar(3) column because it occupies less pages than the varchar(15) one.

    Depending, of course, on the size of the string stored. If all the strings in the varchar(3) are exactly 3 chars long and all the strings in the varchar(15) are either 2 or 3 characters long, then the index on the varchar(15) will be smaller.

    Absolutely right!

    Hey Gail, do you know what would happen if one index was narrower but fragmented so actually occupied more pages than a wider one? Which would be chosen then?

    Tim

    .

  • Tim Walker (6/25/2009)


    Hey Gail, do you know what would happen if one index was narrower but fragmented so actually occupied more pages than a wider one? Which would be chosen then?

    Well, fragmented doesn't necessarily mean more pages, I can have an index that's 99% fragmented but all the pages are close to 100% full.

    Assuming you mean one of the indexes has pages only half full. Haven't tested, don't know for sure, but I suspect it wouldn't factor into the decision. I don't believe that the optimiser knows how fragmented an index is at the time that it compiles the query. I suspect it will base that solely on the statistics of the index which includes the average length of the columns.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (6/25/2009)


    Tim Walker (6/25/2009)


    Hey Gail, do you know what would happen if one index was narrower but fragmented so actually occupied more pages than a wider one? Which would be chosen then?

    Well, fragmented doesn't necessarily mean more pages, I can have an index that's 99% fragmented but all the pages are close to 100% full.

    Assuming you mean one of the indexes has pages only half full. Haven't tested, don't know for sure, but I suspect it wouldn't factor into the decision. I don't believe that the optimizer knows how fragmented an index is at the time that it compiles the query. I suspect it will base that solely on the statistics of the index which includes the average length of the columns.

    Sorry, very sloppy terminology from me. Yes I did mean one that was sparse, e.g. though the original fill factor, page splits, deletions, or whatever!

    I thought about it a bit after I asked the question and I think you must be right. I might test it sometime to find out. Specifying the fill factor to make an index bigger should work I think. I'm sure the optimizer won't consider the original fill factor as it isn't maintained over time.

    Thanks for the reply.

    Tim

    .

  • Thanks Gila!

Viewing 15 posts - 1 through 15 (of 19 total)

You must be logged in to reply to this topic. Login to reply