Clustered Index Internals

  • I agree that is not wrong, but is false as a general statement. It is out of context.

    It is true where you create/rebuilt an index, but false when you insert/update/delete on a table. When you insert/update/delete the order for the existing data is not modified on the disk. The order in this case is mantained logically.

  • Actually I'm not sure it's wrong. When you insert/update data you can cause a page split. In that case, the data is rearranged on disk.

  • If you cause a page split. But if not? It is still false as a general statement.

  • I am very interested in how the rebuilding of a clustered index would result in data being moved to another partition. The partition of a particular record in a table is determined through a calculation based on data in the record itself. Unless either the data or the partition function are changed, than a given record will always be placed into the same partition that it was previously placed in. If this is an accurate statement, than an index rebuild should NEVER result in data being moved into a different partition.

    Also, the doubly linked list would imply that there are pointers on every page to the previous and next pages in the table. I have NEVER read anywhere in any documentation that this exists. As far as I knew, the IAM pages contained the bit maps providing the allocation of pages to a table. Can anyone point me to documentation that describes this linked list?

  • There have been a few concerns regarding the moving of data to a different drive if you rebuild an index in a partition scheme.

    I want to provide a scenario to support the "may" in this option.

    When partitioning a table, you create a filegroup for each partition. Once created files may be created for each filegroup. It may be necessary for each file to be assigned to a different drive. Let's say the table is 300GB and spread across 2 partitions and filegroups. Each of these filegroups has 10 files spread across 10 drives - that means each file is roughly 15GB, if we have partitioned the data down the middle (it's easier for this scenario).

    Each partition data will remain in that partition when a rebuild occurs, however, due to the rebuild we have a scenario where the data could now potentially move from one file to another in the filegroup and thus be on a different drive.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • Dropping and recreating a Clustered Index can definitely be used to move data to another disk. This is one of several documented/recommended ways to move data of certain tables to another disk in case the original disk got too small or too slow. So the statement itself is correct, but sort of incomplete.

    Incomplete, because it requires additional actions : the creation of a new Filegroup on a different disk and the "ON <new-filegroup>" clause in the CREATE INDEX statement.

    Excerpt from the SQL 2005 Books Online :

    Because the leaf level of a clustered index and the data pages are the same by definition, creating a clustered index and using the ON partition_scheme_name or ON filegroup_name clause effectively moves a table from the filegroup on which the table was created to the new partition scheme or filegroup. Before creating tables or indexes on specific filegroups, verify which filegroups are available and that they have enough empty space for the index. For more information, see Determining Index Disk Space Requirements.

    However - I sense some sort of inconsistency here.

    I repeatedly read that "data is *copied* to the leaf pages of the Clustered Index". That would contradict the phrase that it "effectively moves a table", because the original Table would still exist with its original data pages.

    So - is the data from the original Table physically *copied* or *moved* to the Clustered Index leaf pages ?

    If they are copied - what happens to the original Table ? Does it still occupy the diskspace and add to the database size but is not used anymore ?

    I *think* to remember that I had done this procedure once, 3-4 years ago on a SQL 2000 server, to move the Index workload to another disk.

    And as far as I remember I was able to shrink the original MDF file significantly which would confirm that the data are actually moved.

    So the table definition still remains in its original filegroup, but the "leaf level" with the data content is actually saved in the Clustered Index on the other filegroup/disk ?

    And are table data modifications done directly on these moved data pages on the other filegroup/disk within the Clustered Index ?

    Or on the original filegroup/disk and only moved physically to the Index filegroup/disk when dropping/recreating the Index ?

    Very confusing.....

  • It have the book and I agree it is a great book, but the answers to the question were a little misleading I thought.

  • Rich-403221 (12/9/2009)


    Also, the doubly linked list would imply that there are pointers on every page to the previous and next pages in the table. I have NEVER read anywhere in any documentation that this exists. As far as I knew, the IAM pages contained the bit maps providing the allocation of pages to a table. Can anyone point me to documentation that describes this linked list?

    It's actually been documented since SQLServer was first released by Sybase, but a lot of the Microsoft Documentation is pretty poor and badly arranged. Look in BOL under "Clustered Index Structures" http://msdn.microsoft.com/en-us/library/ms177443.aspx

    Cheers

    Roddy

  • "doubly linked lists".

    Well, i learn something today 🙂

    I thought it was B-Tree.

    I don't like questions where all answers are right, it don't learn us the limits, it makes me confusing, and there are some in these QoD. :/


    My MCP Transcript (ID : 692471 Access : 109741229)

  • My page 311 paragaph 1 says something different than your page 311 paragraph 1.

    Mine says: "...This order is maintained through a doubly linked list called a page chain. (Note that pages in a heap are not linked in any way to each other.) The order of pages in the page chain, and the order of rows on the data pages, is based on the definition of the clustered index. Deciding on which column(s) to cluster is an important performance consideration." (End of Paragraph)

    This part isn't in my book.

    "When the clustered index is created, it does in fact reorder the data physically on disk according to the cluster key, or as close to physical order as possible given the available space. After that, maintaining the clustered index is a logical operation. Rebuilding the index will again order the data physically. Reorganizing the index is an attempt to make the logical and the physical order match, without all the work of rebuilding the index...."

    That being said, I don't understand why everyone is confused.

    A clustered index does physically rearrange data on a disk when it is created and can continue to do so if there are page splits or when the index is rebuilt. There's no ambiguity here.

    A clustered index logically tracks data on disk using doubly linked lists. This has been true for a long time.

    Dropping and rebuilding a clustered index on a partition scheme may move data to other drives. The whole point of using a partition scheme is to distribute the data across multiple disks to improve performance by distributing the I/O as well as by creating logically and physically smaller partitions to search. Even if you don't use partitions, the best way to copy a table from one filegroup to another is to drop the clustered index and re-create it using a different file group. That has also been true for quite awhile.

    I find it particularly amusing that some of you have the chutzpah to claim that Kalen Delaney is wrong. 🙂

    "Beliefs" get in the way of learning.

  • Re: A clustered index physically rearanges data on disk.

    Technically, the use of the word "disk" is wrong. It rearranges data within the (SQL Server) pages and extents. Where these pages are placed on the physical disk (track and cylinder) is not within the control of SQL Server. It is the OS (Windows) and/or the physical hardware implementation (e.g., a SAN) that determines where the data is actually placed on the physical disk.

    The only item that I know of that "physically rearanges data on disk" is a disk defragmenter.


    [font="Arial Narrow"](PHB) I think we should build an SQL database. (Dilbert) What color do you want that database? (PHB) I think mauve has the most RAM.[/font]

  • Mauve (12/10/2009)


    Re: A clustered index physically rearanges data on disk.

    Technically, the use of the word "disk" is wrong. It rearranges data within the (SQL Server) pages and extents. Where these pages are placed on the physical disk (track and cylinder) is not within the control of SQL Server. It is the OS (Windows) and/or the physical hardware implementation (e.g., a SAN) that determines where the data is actually placed on the physical disk.

    The only item that I know of that "physically rearanges data on disk" is a disk defragmenter.

    That's a semantic argument that isn't even accurate. Those pages and extents exist in files on disk and the disk controller must access them. If I rebuild the clustered index then the pages (and their contents) that were logically connected via page chains will be re-arranged (on the disk), i.e. an individual row may move from one physical disk sector of a file extent to another. If I insert a million new rows and my file group needs to extend then that additional disk space must be allocated. I'll grant you that the file system created the extent but it is SQL Server that is creating the physical pages that will be stored on the extent.

    I suppose you could say that "technically" the only thing that writes to a disk are the read-write heads which are managed by the firmware of the device but that kind of defeats the purpose of the discussion. We don't need to be concerned with that level of abstraction. It's a lot simpler to say that SQL Server physically rearranges data on disk.

    "Beliefs" get in the way of learning.

  • Robert Frasca (12/10/2009)


    I find it particularly amusing that some of you have the chutzpah to claim that Kalen Delaney is wrong. 🙂

    Far be it for me to question Kalen !!!! I just had never heard of the linked lists and still am not totally comfortable with the partitioning part. I still think that the partitioning function will put data right back where it had originally been ....

  • Robert is right, the quote is not quite the same as in the book. The following email will explain the difference 😉

    --**********************************************************************************

    Hi Brandon

    Of course you can publish this note. I would suggest a minor edit to your last sentence:

    When the clustered index is created, it does in fact reorder the data physically on disk according to the cluster key, or as close to physical order as possible given the available space. After that, maintaining the clustered index is a logical operation. Rebuilding the index will again order the data physically. Reorganizing the index is an attempt to make the logical and the physical order match, without all the work of rebuilding the index.

    Best regards,

    Kalen Delaney, SQL Server MVP

    http://www.SQLServerInternals.com

  • Robert Frasca (12/10/2009)


    Dropping and rebuilding a clustered index on a partition scheme may move data to other drives. The whole point of using a partition scheme is to distribute the data across multiple disks to improve performance by distributing the I/O as well as by creating logically and physically smaller partitions to search. Even if you don't use partitions, the best way to copy a table from one filegroup to another is to drop the clustered index and re-create it using a different file group. That has also been true for quite awhile.

    No its not. The best way is to create with drop existing - not dropping and creating a new one. Many tools on the market take this approach - with the typical effect that they need to drop and recreate every foreign key that references the primary key that you are dropping and creating. Try that on a table with 30 rows but referenced by one with 4bn...

Viewing 15 posts - 16 through 30 (of 40 total)

You must be logged in to reply to this topic. Login to reply