Stairway to SQL Server Indexes: Step 11, Index Fragmentation

  • Comments posted to this topic are about the item Stairway to SQL Server Indexes: Step 11, Index Fragmentation

  • In an example #2 to accomodate one month of inserts into 12 months table, how do we figure out that fillfactor might be 80-84%? We insert 100/13= 8% of the data, so we would reserve 8% of the space.

  • No offense intended, but when I refer to "external fragmentation", I refer to fragmentation outside of SQL Server's control; even if SQL Server reports page 1's next page is 2, whose next page is 3, at the OS level, perhaps they're on three completely different areas of the disk. Perhaps even the OS level is contiguous, but the SAN (Meta)LUN actually has each on noncontiguous areas of the platters.

    In either case, the head has to move more than it should to read three pages.

  • What would a treshold (in the number of index pages), below which we should not worry about fragmentation and don't have to rebuild?

  • ?

  • I wish people would stop perpetuating the phrases 'external fragmentation' and 'internal fragmentation' and use the phrases that the SQL team, the tools, and Books Online uses: logical fragmentation and low page density.

    None of the tools report 'external' or 'internal' fragmentation and haven't done since at last SQL Server 7.0/2000 with my old DBCC SHOWCONTIG.

    /rant.

    Paul Randal
    CEO, SQLskills.com: Check out SQLskills online training!
    Blog:www.SQLskills.com/blogs/paul Twitter: @PaulRandal
    SQL MVP, Microsoft RD, Contributing Editor of TechNet Magazine
    Author of DBCC CHECKDB/repair (and other Storage Engine) code of SQL Server 2005

  • Paul Randal (11/26/2013)


    I wish people would stop perpetuating the phrases 'external fragmentation' and 'internal fragmentation' and use the phrases that the SQL team, the tools, and Books Online uses: logical fragmentation and low page density.

    None of the tools report 'external' or 'internal' fragmentation and haven't done since at last SQL Server 7.0/2000 with my old DBCC SHOWCONTIG.

    /rant.

    I completely agree that "internal" and "external" fragmentation are useless phrases, particularly when you also start talking at a filesystem level or beyond. Perhaps we can come up with a descriptive list of various kinds of fragmentation, and then assign short names?

    I'll start, and include some perhaps theoretical types. Paul, if you think this is of any value, I would more than welcome your corrections and input - I'm leaving the SQL types at the very bottom fairly empty of detail, for real experts to fill in.

    ****** The first set are rare, interesting, and/or often hidden types requiring less common techniques or tools to detect and/or correct - skip to the next set of asterisks for the more common types ******

    "short name" - Block level fragmentation - a logical storage object comprised of noncontiguous physical blocks that could be more contiguous than they are. SSD's almost always have this. Spindle disks that have done bad sector remapping may have this.

    "short name" - SAN/Host MetaLUN/virtual disk level fragmentation - a MetaLUN/virtual disk comprised of noncontiguous LUN/virtual disks that could be more contiguous than they are.

    "short name" - SAN/Host LUN/virtual disk level fragmentation - a LUN/virtual disk comprised of sections of spindle/SSD space that could be more contiguous than they are. Also includes JBOD (Just a Bunch of Disks) where a partition could be more contiguous than it is. Thin provisioned virtual disks usually end up with this (for Hyper-V, the guest sees it as SAN/Host level, while the host sees it as Filesystem level fragmentation, unless you have nested hosts, in which case Google "recursion")

    "short name" - Partition level fragmentation - a partition comprised of sections of LUN/spindle/SSD space that could be more contiguous than they are. Windows Dynamic disk volumes can have this level, for instance if they end up with noncontiguous or out of order extents (i.e. extents 1, 2, 3, 4, 5 and D: has extents 3,1,4 while E: has extents 2, 5)

    ****** Below this line are common types a systems DBA deals with ******

    "short name" - Filesystem drive letter/mount point level fragmentation - A drive letter/mount point containing of multiple files that get accessed at overlapping times. For instance, a SQL Server filegroup with more than one file per drive letter/mount point, such that when SQL accesses the filegroup, reads and writes must skip back and forth between files - thus, if the drive letter/mount point represents, say, a single spindle, the drive head has to move from File A to File B and back to File A during the filegroup access. This is also caused by putting both the data and the log files of a READ_WRITE database on the same drive letter/mount point. This is reduced by having one file per drive letter/mount point that is accessed at a time.

    "short name" - Filesystem File level fragmentation - one or more files comprised of sectors that could be more contiguous than they are. Typically can be reduced by filesystem level defragmentation tools (Piriform Defraggler, Windows Defrag, etc.) or backing up the file(s), deleting as much as possible including the file(s) affected, then recopying the file(s) back, single-threaded, with a tool that pre-allocated space as much as possible for the filesystem involved. SQL Server data and log files which grow are often subject to this, as they are allocated space from effectively arbitrary locations on the filesystem.

    "SQL Server low page density" - pages are could be more full than they are. Typically can be reduced by index REBUILD (where it is controlled by FILLFACTOR and PAD_INDEX settings).

    "SQL Server logical fragmentation" - pages and/or extents are out of order. Reduced by index REBUILD or REORGANIZE.

  • I didn't quite get the external fragmentation for random inserts pattern. How did it actually happen in example that after 50000 random inserts all 128 pages are out-of- order?

    128 pages for 50k rows mean that we have ~390 index entries per page at leaf lvl.

    So, say we did first 390 random inserts, then we got page split and SQL Server found an empty page somewhere ahead of the 1st initial page. So, it moved half of the data forward. Next-page pointer of the 1st page points to this 2nd page, so, these pages are ordered now, and so on, and so on.

    Or the trick is that index entries are not actually sequenced within a page (sequence is achieved through page offset pointers) - so, the half of the index that is moved contains unsequenced entries and that's why all these values are spread across 128 pages and so that's why the next-page pointers get out of order?

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply