Introduction to Indexes: Part 3 – The nonclustered index

  • Nadrek (1/7/2011)


    I didn't see any explanation of how, when, and why one would consider equality vs inequality of matches, and that's one of the areas I definitely need to learn more about.

    http://sqlinthewild.co.za/index.php/2009/01/19/index-columns-selectivity-and-equality-predicates/

    http://sqlinthewild.co.za/index.php/2009/02/06/index-columns-selectivity-and-inequality-predicates/

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • That makes sense. Thanks so much for the help and the fast reply!

  • great article . its very useful

    thanks !

  • Awesome, Gail, thanks a ton.

    Do you have the same screenshot for the last two query plans? They both look exactly the same and say they're query 1. But your description I believe refers to a difference.

  • I'd have to check, but I doubt I still have the code and screenshots available if it is the wrong one.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Oh, if only I had seen these articles the first time round, so much blood, sweat and tears would have been saved (especially the latter).

    Thank you Gail, excellent articles. I expect to be referring to these regularly.

  • "Include columns are useful in that they are columns available in the index but not contributing to the size of the index key"

    I don't think this is true. I worked on a table with 45M rows, which had about 18 non clustered indexes. Several of them were created with included columns(and some of them had lots) and their size was enormous. More than half the size of that db was in this table's indexes only. I dropped the 3 largest of them and I freed loads of space.

    Included columns do affect the index size. I must mention that those indexes were created in time by using the tuning advisor tool in MS for different queries by applying the recommendations. This tool might help with specific queries but they end up creating these huge indexes with included columns, which endup eating lots of space and actually killing performance.

    Don't just give the hungry man a fish, teach him how to catch it as well.

    the sqlist

  • the sqlist (1/21/2011)


    "Include columns are useful in that they are columns available in the index but not contributing to the size of the index key"

    ...

    Included columns do affect the index size.

    I never said they don't affect the index size. I said they don't affect the size of the index key, which they do not because they are not part of the index key and hence do not count towards the 900 byte limit on the index key.

    They are stored only at the leaf level of the index, not at the intermediate levels but since they are stored at the leaf level, they are part of the index and hence will contribute to the total size of the index on disk.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (1/21/2011)


    the sqlist (1/21/2011)


    "Include columns are useful in that they are columns available in the index but not contributing to the size of the index key"

    ...

    Included columns do affect the index size.

    I never said they don't affect the index size. I said they don't affect the size of the index key, which they do not because they are not part of the index key and hence do not count towards the 900 byte limit on the index key.

    They are stored only at the leaf level of the index, not at the intermediate levels but since they are stored at the leaf level, they are part of the index and hence will contribute to the total size of the index on disk.

    I agree, I missed the nuance, but the important thing here and worth mentioning is the overall index size. Key of course remains the same.

    Don't just give the hungry man a fish, teach him how to catch it as well.

    the sqlist

  • Great series Gail. Learned a lot as well as a great refresher.

    Steve Jimmo
    Sr DBA
    “If we ever forget that we are One Nation Under God, then we will be a Nation gone under." - Ronald Reagan

  • Great series.. They bring back some forgotten concepts. Thanks

    I do have a question on the 'Include' feature in nonclustered indexes. You touch on how to use it in your article, which I agree. I have come across some articles and even from the SQL optimzer execution plan about how the 'include' is used in a totally different way than you describe, and I like to get your opinion on.

    The opinion is that in a query that consists of join and where clause. We should put the SGR in the where clause as the key field in the nonclused index and put the fields used in the join in the 'include'. When I was turning a query and sometimes query optimizer in SQL 2008 would recognize a missing index and suggest a new one, the new index actually is constructed as what the opinion mentioned.

    Sometimes this type of nonclustered index is quite confusing to me as it contradicts the genernal understanding as you describe. Furthermore, if the where clause consists of very low selectivity field such as status, the optimizer still suggests the status field as the main key field in the index and put other high selectivity columns in the 'Include' field. This really doesn't make sense to me.

    Let me know your thoughts or if others have better understanding of this, feel free to chime in.

    W

  • OceanDeep (1/21/2011)


    The opinion is that in a query that consists of join and where clause. We should put the SGR in the where clause as the key field in the nonclused index and put the fields used in the join in the 'include'.

    I would put both as key columns. They're both potentially needed as search arguments, hence they belong in the key. Include is for columns that are selected but not searched (or joined) on

    Sometimes this type of nonclustered index is quite confusing to me as it contradicts the genernal understanding as you describe. Furthermore, if the where clause consists of very low selectivity field such as status, the optimizer still suggests the status field as the main key field in the index and put other high selectivity columns in the 'Include' field. This really doesn't make sense to me.

    Are those 'high selective' columns used in the where clause? If they're only in the select, there's no point in making them key columns as they aren't search arguments.

    Bear in mind that the missing index suggestions are just that. Suggestions. Often wrong because the optimiser's only ever looking at one query at a time.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Very good article. I do have a question. I am aware of SQL Server creating intersecting indexes when you have a query on multiple columns and each column has its own index. What I had not heard before is that SQL Server may not necessarily use that schema, but only use one and do a scan for remaining values.

    I am on SQL Server 2000. I am not much of an execution plan expert, but would like to know what to look for in the execution plan to determine if it is using a single index or multiple indexes.

    We have a lot of queries where we join to multiple tables on multiple columns. Most of these tables have individual indexes on the join columns, and before I heard about intersecting indexes, I was under the impression that only one index would be used, thus compound indexes should be used. I'd like to prove or disprove this 'hypothesis' by looking at some execution plans for some of the queries.

  • http://sqlinthewild.co.za/index.php/2010/09/14/one-wide-index-or-multiple-narrow-indexes/

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Doug Bishop (1/21/2011)


    Very good article. I do have a question. I am aware of SQL Server creating intersecting indexes when you have a query on multiple columns and each column has its own index. What I had not heard before is that SQL Server may not necessarily use that schema, but only use one and do a scan for remaining values.

    I am on SQL Server 2000. I am not much of an execution plan expert, but would like to know what to look for in the execution plan to determine if it is using a single index or multiple indexes.

    We have a lot of queries where we join to multiple tables on multiple columns. Most of these tables have individual indexes on the join columns, and before I heard about intersecting indexes, I was under the impression that only one index would be used, thus compound indexes should be used. I'd like to prove or disprove this 'hypothesis' by looking at some execution plans for some of the queries.

    Obviously the preferred way is to use the composite indexes. If you use the composite index you'll get the result in one trip using only one index. The problem is if the columns in the composite key may not be used all in all queries. My approach is this:

    For the join always use the PKs(in general clustered), which also define the relations between tables. For any other filtering you need use non clustered composite indexes.

    Let's say you have a table with an ID column as PK with multiple columns of which 3 , A, B and C, are frequently used in where clauses. Because it could be possible to use any combination of these 3 keys(could be more) and not necessarily all 3 of them each time what I do is creating 3 indexes like this:

    A+B+C

    B+C

    C

    This configuration covers pretty much any combination and the pattern could be used, if needed, with any number of columns.

    Don't just give the hungry man a fish, teach him how to catch it as well.

    the sqlist

Viewing 15 posts - 46 through 60 (of 92 total)

You must be logged in to reply to this topic. Login to reply