Covering Index using Included Columns

  • sqlrunner

    Mr or Mrs. 500

    Points: 513

    Comments posted to this topic are about the item Covering Index using Included Columns

    Josef Richberg
    2009 Exceptional DBA
    http://www.josef-richberg.squarespace.com
    http://twitter.com/sqlrunner

  • Selvam Rajendran

    SSC-Addicted

    Points: 485

    Hi Josef Richberg,

    I couldn't see the Real World Examples in this Page. I tried in all browsers.

    Regrds

    selvam R

  • arr.nagaraj

    SSCertifiable

    Points: 6518

    Hi Josef,

    Good write up. Just a bit surprised that you didnt get much into when to use a Included column covering index..ie Index selection.

    I did notice that there was a sentence 'Anything that is in the where clause would be a key, anything in your select statement would be an included column.' Though it works at most of the cases, it could have better had it been explained bit more.

    It would think twice before including a column which is very huge/including too many number of columns, or a frequently updated column. Perhaps this one can be a relevant read.

    http://strictlysql.blogspot.com/2009/11/covering-index-include-nonkey-columns.html

  • SQLRNNR

    SSC Guru

    Points: 281243

    Thanks Josef. Nice concise explanation. The data space requirements is also good information.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • Thomas-428301

    SSC Veteran

    Points: 279

    I like the benefit you have shown of the include clause but it’s not very clear how the pages were accessed at the file level and how there are lesser bytes at intermediate levels when compared.

  • arr.nagaraj

    SSCertifiable

    Points: 6518

    @Thomas-428301,

    Included columns are not present at intermediate levels and are present only at leaf levels. So it saves space. You can verify the same using the DMV

    sys.Dm_db_index_physical_stats which provides details at each level of the index.

  • pslavik

    SSC Journeyman

    Points: 98

    First of all, good article explaining the basics of included columns.

    Unfortunately, the space saving of 48.9% is completely wrong. Yes, you are saving 48.9% space in the intermediate levels (actually little less because of pointers to the next level) but that amounts to about 0.4% for the whole index! 0.4%! That's it. Not worth talking about since fragmentation will cost you much more.

    Anyway, here are more details, somebody please check my math πŸ™‚

    Assume 8kB ~ 8000B per page (it's actually 8096 but this will make the math easier).

    Assume a pointer to the next level is 4B (it is 6B but 4B will make it simpler).

    Assume that a pointer to a table row is also 4B (that depends on the table, in most cases it's more than 4B).

    Ignore any row overhead.

    In any case, these assumptions give advantage to an index with included columns, so no harm done.

    Let's take a table with 1,000,000 rows.

    Full index

    LEAF LEVEL: 1,000,000 leaves, 50B/leaf ==> 6,250 pages

    INTERMEDIATE LEVEL: 6,250 nodes, 50B/node ==> 160 nodes/page ==> 40 pages

    ROOT LEVEL: 40 nodes, 50B/node ==> 160 nodes/page ==> 1 page

    Total# pages: 6,291

    Index with included columns

    LEAF LEVEL: 1,000,000 leaves, 50B/leaf ==> 160 leaves/page ==> 6,250 pages

    INTERMEDIATE LEVEL: 6,250 nodes, 20B/node ==> 400 nodes/page ==> 16 pages

    ROOT LEVEL: 16 nodes, 20B/node ==> 400 nodes/page ==> 1 page

    Total# pages: 6,267

    6,291 / 6,267 = 1.003829 ... ~ 1.004

    Clearly, there are slight differences in the final number depending on the table size but no matter what, the number will be very similar for any number of rows.

    Also, in Josef's example, the leaf node/index key was relatively small and therefore the number of intermediate nodes compared to the # leaf nodes was very small. With wider indexes, the number of intermediate nodes increases and so does the amount of saved space.

  • jshailendra

    SSCrazy

    Points: 2260

    Josef, thanks for the article...I learned about included columns..

  • RichardBo

    Ten Centuries

    Points: 1260

    What does SARG mean?

    I'll go and look it up now, but I like to think of these articles as stand alone so would like an explanation the article itself.

    Whether the maths is right or not, it is a useful tool for the toolbox. Give it try when the situation arises see if the performance increases.

  • Gail Shaw

    SSC Guru

    Points: 1004474

    RichardBo (2/11/2010)


    What does SARG mean?

    Searchable ARGument. It means a predicate in the where clause that can be used for an index seek operation. As a very high level summary, that's a predicate that directly compares a column to an expression/value. So no functions or expressions on the column side.

    That's very high level, there's a lot more to it. A google search should turn up several arguments.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • carl.anderson-1037280

    Mr or Mrs. 500

    Points: 522

    This article is in regard to indexes with included column on heap tables. This means the table doesn't have a clustered index, right? In our data warehouse, all of our indexed tables use at least one clustered index (as per DBA mandate).

    How does this article change if we're talking about a table with a clustered index?

    Also, is there a compelling reason to use heap tables?

    Thanks for the article!

    Carl Anderson

    Data Architect

    Northwestern University EDW

  • sqlrunner

    Mr or Mrs. 500

    Points: 513

    carl.anderson-1037280 (2/11/2010)


    This article is in regard to indexes with included column on heap tables. This means the table doesn't have a clustered index, right? In our data warehouse, all of our indexed tables use at least one clustered index (as per DBA mandate).

    How does this article change if we're talking about a table with a clustered index?

    Also, is there a compelling reason to use heap tables?

    Thanks for the article!

    Carl Anderson

    Data Architect

    Northwestern University EDW

    The leaf level of the non-clustered index would point, not to the heap table, but to the clustered index. The concepts still hold. You would make a covering index if your clustered did not satisfy your query. Regarding Heap Tables. Personally, I only put a clustered index on a table if there is a reason to, not just because there is a mandate. I prefer to have clustered indexes unique, although with 2005+ there is a uniquifier column just in case your index is not unique by key. If there are primary/foreign key relationships, then you have a clustered index.

    Josef Richberg
    2009 Exceptional DBA
    http://www.josef-richberg.squarespace.com
    http://twitter.com/sqlrunner

  • Oleg Netchaev

    SSCertifiable

    Points: 5272

    RichardBo (2/11/2010)


    What does SARG mean?

    The best possible prelude will be Gail's own excellent article titled Introduction to Indexes. It has 3 parts, first was published here last September and others a little later. The url to the first part is http://www.sqlservercentral.com/articles/Indexing/68439/[/url]. Gail's article includes the link to Brad Mcgehee article about SARGable predicates.

    Oleg

  • TravisDBA

    SSCoach

    Points: 15780

    Although using INCLUDE columns in an Index is not practical in all situations, using them to eliminate a Bookmark Lookup (RID) in an execution plan in a lot of situations I have generally found to be a good thing in query response times. Grant Fritchey has spoken on this in the past as well and he is an expert on Query Execution Plans IMHO.:)

    "Technology is a weird thing. It brings you great gifts with one hand, and it stabs you in the back with the other. ...:-D"

  • Fal

    SSCrazy

    Points: 2873

    carl.anderson-1037280 (2/11/2010) In our data warehouse, all of our indexed tables use at least one clustered index (as per DBA mandate).

    Carl Anderson

    Data Architect

    Northwestern University EDW

    Not sure about a mandate, as a clustered index relates to how the data is physically stored on disk, hence you can only have one. Given that, it's best to have your clustered index on column(s) that will give a benefit in that regard.

    My own preference is on column(s) that are required in common range lookups. So for unique IDENTITY() columns: No. But for datetime columns, or a "category" column, then you get the benefit of the physical order when reading pages BETWEEN <here> and <there>.

    S.

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply