Introduction to Indexes: Part 3 – The nonclustered index

  • Hi Gail,

    It was really a nice series to learn about the indexes. However, I have one confusion here, so you may guide me :hehe:

    Unlike the clustered index nonclustered indexes does not contain the entire data row at the leaf level. Rather, the nonclustered index contains just the columns defined in the index, and a pointer to the actual data row.

    Yes, understood.

    Columns specified as include columns are stored at the leaf level of the nonclustered index, but not at the intermediate or root levels.

    Here from the above statement I wanted to know that how this included columns data is stored in the leaf level. Is it also stored as the index keys are stored ? i.e.

    Included Column value ----> Pointer to actual data row

    ??


    Sujeet Singh

  • Divine Flame (11/29/2011)


    Here from the above statement I wanted to know that how this included columns data is stored in the leaf level.

    Same as the index key values are. Part of the row in the leaf pages. That's the definition of an include column - one that's in the index leaf pages only, not the intermediate and root levels.

    Included Column value ----> Pointer to actual data row??

    Errr... The pointer to the actual data row is either the clustered index key or the RID if there's no clustered index on the table.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • As with most everyone else, just wanted to say thanks for all the wonderfulness. Also, you are my hero and when [if] I grow up I want to be just like you. 😀

    [font="Courier New"]Looking for a Deadlock Victim Support Group..[/font]
  • GilaMonster (12/8/2011)


    Same as the index key values are. Part of the row in the leaf pages. That's the definition of an include column - one that's in the index leaf pages only, not the intermediate and root levels.

    Thanks a lot Gail that you spared some time to answer my question. I got it now.

    Divine Flame


    Included Column value ----> Pointer to actual data row??

    GilaMonster (12/8/2011)


    Errr... The pointer to the actual data row is either the clustered index key or the RID if there's no clustered index on the table.

    I understand this one correctly, but I think I wrote it here in a wrong way. I just wanted to stimulate the structure of an index like we have on the back of the books :-). I didn't mean to say that Included column are the pointers to the actual row :-D.

    Your article was written so well that after reading them, I was always clear that pointers are either RID or clustered index key :-).


    Sujeet Singh

  • Hi Gail, another thanks for the series!

    We've always used the 'rule' that every table needs a clustered index and I've never really challenged that. If the PK is lean enough, we use that, otherwise we put an IDENTITY(int) column to the table and take that... a simple no-brainer in most cases.

    However... suppose we have a large table (wide & long) that gets INSERTs and UPDATEs on non-indexed fields only.

    Would it make sense to 'force' it as a heap and use non-clustered indexes only because I (naively?) would expect that the Bookmark lookup (**) would be A LOT faster using the RID than using the clustered-index key(s); the latter requiring zigzagging through the entire (clustered) b-tree again ... ??

    I agree that if queries attack the table using the clustered index keys directly the lack of a bookmark lookup makes them 'winners'. But in our case we have at least 2 other totally different approaches in accessing the same table and then those need to go via a rather expensive bookmark lookup now... making it a heap would make every approach more or less equally good/bad...

    The only reason NOT to work like this would be that whenever a table-change affects a record to move from one page to another this would affect all the leaf-nodes of every index and it would turn out expensive to update all relevant leaf-nodes to the new RID. Then again, since we only add to the heap, such a situation would pretty much never happen and I see no really reason to (frequently) rebuilding the heap either...

    I might be missing something fundamental here but plan on giving it some testing the next days anyway... fascinating stuff =P

    Roby

    PS: another issue might be that I've noticed in the past that INSERTS on a heap seem to be slower than those on e table with the clustered index on an Identity() table... this might not be relevant any more on recent versions of the product. I never really understood why this was the case anyway...

    (**: which as I understand it is the part between 'finding the RID/clustered-index field(s) from the non-clustered index' and the actual fetch of the (non-indexed) fields from the table)

  • deroby (7/6/2012)


    Hi Gail, another thanks for the series!

    We've always used the 'rule' that every table needs a clustered index and I've never really challenged that. If the PK is lean enough, we use that, otherwise we put an IDENTITY(int) column to the table and take that... a simple no-brainer in most cases.

    However... suppose we have a large table (wide & long) that gets INSERTs and UPDATEs on non-indexed fields only.

    Would it make sense to 'force' it as a heap and use non-clustered indexes only because I (naively?) would expect that the Bookmark lookup (**) would be A LOT faster using the RID than using the clustered-index key(s); the latter requiring zigzagging through the entire (clustered) b-tree again ... ??

    Well, if your UPDATEs sometimes make, say, VAR* columns larger, then you're likely to see table fragmentation, which is best fixed by normal index maintenance on the clustered index.

    I would say that the ability to do index maintenance is another very solid argument for "everything gets a clustered index".

  • deroby (7/6/2012)


    However... suppose we have a large table (wide & long) that gets INSERTs and UPDATEs on non-indexed fields only.

    Would it make sense to 'force' it as a heap and use non-clustered indexes only because I (naively?) would expect that the Bookmark lookup (**) would be A LOT faster using the RID than using the clustered-index key(s); the latter requiring zigzagging through the entire (clustered) b-tree again ... ??

    Most likely no. SQL is optimised for tables having clustered indexes. While a RID lookup will read less pages than a key lookup, the lookup is still a horridly slow operation that you want to avoid most of the time.

    I agree that if queries attack the table using the clustered index keys directly the lack of a bookmark lookup makes them 'winners'. But in our case we have at least 2 other totally different approaches in accessing the same table and then those need to go via a rather expensive bookmark lookup now... making it a heap would make every approach more or less equally good/bad...

    Why does two approaches automatically mean lookups?

    Even if it did, making the table a heap would probably make both equally bad, which is probably not the optimisation strategy I would favour.

    The only reason NOT to work like this would be that whenever a table-change affects a record to move from one page to another this would affect all the leaf-nodes of every index ...

    When rows are moved in a heap (typically when they are updated and grow in size) they leave forwarding pointers behind, the nonclustered indexes aren't updated because that would be expensive.

    Forwarding pointers make reads of the heap more expensive than it would be othewise.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Hi Gail

    I want to thank you for your effort on indexes.

    I'll use this opportunity to ask a question.

    Example: There is a Table with one clustered index and some non-clustered. It is connected to more other tables.

    Then I create non-clustered index(es) on the table, in order to execute some queries faster, and then i want to drop the non-clustered index(es), because some other actions to that table may change.

    Is it a reasonable practice?, won't it change something? Do i need to rebuild the previously existed indexes?, or it is not needed.

    Thanks

    IgorMi

    Igor Micev,My blog: www.igormicev.com

  • I don't understand what you're asking.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • IgorMi (7/20/2012)


    Hi Gail

    I want to thank you for your effort on indexes.

    I'll use this opportunity to ask a question.

    Example: There is a Table with one clustered index and some non-clustered. It is connected to more other tables.

    Then I create non-clustered index(es) on the table, in order to execute some queries faster, and then i want to drop the non-clustered index(es), because some other actions to that table may change.

    Is it a reasonable practice?, won't it change something? Do i need to rebuild the previously existed indexes?, or it is not needed.

    Thanks

    IgorMi

    I think you want to know that whether deleting few indexes on a table can make other indexes (on the same table) fragmented?

    If this was your question, i think the answer is NO. Deleting any non-clustered index on a table doesn't increase the fragmentation of other indexes.


    Sujeet Singh

  • GilaMonster (7/20/2012)


    I don't understand what you're asking.

    My question is the following:

    there is an existing table with lets say 5 indexes(1 clustered + 4 nonclustered).

    The Estimated Execution Plan of SSMS, for a particular stored procedure or query, suggests to add 3 more nonclustered indexes (for the same table), and I add them just for my SP to execute faster.

    After that I don't need the last 3 indexes. The questions is: Is it enough just to drop those 3 indexes, or i have to drop them and additionally rebuild the old (the 5 indexes) indexes?

    Regards

    IgorMI

    Igor Micev,My blog: www.igormicev.com

  • Don't do that. Waste of time. It is highly, exceedingly unlikely that it's faster to create all the indexes then run the query then drop the indexes than to just run the query. Plus creating and dropping indexes will cause plan recompilation, extensive locks, etc.

    Do your analysis, if you need the indexes, create them permanently. If you don't, don't create them.

    The only place you typically create and drop indexes frequently is data warehouses. Drop/disable indexes before the data load, recreate/rebuild afterwards.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (7/20/2012)


    Don't do that. Waste of time. It is highly, exceedingly unlikely that it's faster to create all the indexes then run the query then drop the indexes than to just run the query. Plus creating and dropping indexes will cause plan recompilation, extensive locks, etc.

    Do your analysis, if you need the indexes, create them permanently. If you don't, don't create them.

    The only place you typically create and drop indexes frequently is data warehouses. Drop/disable indexes before the data load, recreate/rebuild afterwards.

    Thanks for this!

    One of my last tasks was to make a SP execute faster. I run the Estimated Execution Plan of SSMS and it suggested creating 3 new indexes. The tables were quite big (about 270 million records). The SP was lasting 'infinite' and after creation of indexes it finished for an 1.5h. But additionally I replaced the passing arguments of the nested functions and SPs (called from the main SP) with local variables and then use them through the code, so my doubt here is whether i reached the goal with the new indexes or with the replacement of the passing arguments of all the nested SPs and FNs. I read somewhere in BOL for the passing arguments to be replaced with local variables when nesting.

    I haven't dropped the indexes yet. On one table they increased up to 8 indexes. Because the server is for replications, i decided to do that.

    What is your suggestion in this situation? Is it smart to tell the admins to disable those indexes when loading new data? if they never disable them. I'm not sure whether they disable them whenever they load data. The admins are not on my local site, that is why i haven't asked them yet, but I would.

    Thanks in advance

    IgorMi

    Igor Micev,My blog: www.igormicev.com

  • Hard to say without detailed analysis.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Calypso Bay Jacobs Well Land Estate Reviews - Roche Group - Luxury Real Estate located between Brisbane and the Gold Coast in Queensland,

    Australia.

    calypso bay[/url]

    calypso bay[/url]

Viewing 15 posts - 76 through 90 (of 92 total)

You must be logged in to reply to this topic. Login to reply