Possible sources of fragmentation on Clustered Index

  • I have a table that has a clustered index that is only the identity column on the table. The table is somewhere around 200K rows and has 3800 pages in the index. We run our index maintenance every other day on this database using Ola's scripts and this index is rebuilt because it is 40-60% fragmented after 2 days. Overall, this isn't really too much of a problem since the index rebuild doesn't take too long, but I am puzzled as to how this index is getting fragmented since the only column in it is the identity. Here's how the table is structured (the names were changed):

    CREATE TABLE [dbo].[Example](

    [ExampleID] [int] IDENTITY(1,1) NOT NULL,

    [ExampleCode] [varchar](10) NOT NULL,

    [ForeignID] [int] NOT NULL,

    [AnotherID] [int] NOT NULL,

    [ScheduleCode] [int] NOT NULL,

    [StartDate] [datetime] NOT NULL,

    [EndDate] [datetime] NOT NULL,

    [PersonID] [int] NOT NULL,

    [ExampleDescription] [varchar](1000) NOT NULL,

    [CreatedDate] [datetime] NOT NULL,

    [CreatedByID] [int] NOT NULL,

    [ModifiedDate] [datetime] NOT NULL,

    [ModifiedByID] [int] NOT NULL,

    [ExampleFlag] [bit] NOT NULL,

    [OtherFlag] [bit] NOT NULL,

    [YetAnotherFlag] [bit] NOT NULL,

    [ExampleDate] [datetime] NULL,

    [LastDate] [datetime] NOT NULL,

    [FinalID] [int] NOT NULL,

    CONSTRAINT [PK_Example] PRIMARY KEY CLUSTERED

    (

    [ExampleID] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 98) ON [PRIMARY]

    ) ON [PRIMARY]

    There is nothing strange like updates to the identity happening and while some records are deleted, there has only been about 20,000 in the life of the table (months). Not enough to account for the level of fragmentation that we're seeing on the index.

    About the only thing I can think of that would cause fragmentation on this index in this scenario are:

    1. Page splits caused by starting with a small value in one of the VARCHARs and later inserting a larger value

    2. Page splits caused by the NULLABLE column, ExampleDate, starting with NULLs and later updating them to a date.

    For #1, I had development check the update scenarios for the varchar columns, especially the varchar(1000) one, and they didn't see it as a common thing where the values would go from small (or empty) to large.

    For #2, I checked and found that the only value for that column in the table is NULL so while it always starts as NULL, it never gets updated to anything else.

    I've tried looking at sys.dm_db_index_operational_stats and the leaf_update_count is around 300,000, but unless those updates are causing page splits, I don't see how they would contribute to fragmentation.

    What other scenarios would cause this fragmentation? What else can I look look at to get to the bottom of this?

  • The only reasonable explanation seems to be updates. Set up a tight trace to disk filtering on just the name of the table maybe to catch what is going on?

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • ErikMN (1/23/2015)


    I have a table that has a clustered index that is only the identity column on the table. The table is somewhere around 200K rows and has 3800 pages in the index. We run our index maintenance every other day on this database using Ola's scripts and this index is rebuilt because it is 40-60% fragmented after 2 days. Overall, this isn't really too much of a problem since the index rebuild doesn't take too long, but I am puzzled as to how this index is getting fragmented since the only column in it is the identity. Here's how the table is structured (the names were changed):

    CREATE TABLE [dbo].[Example](

    [ExampleID] [int] IDENTITY(1,1) NOT NULL,

    [ExampleCode] [varchar](10) NOT NULL,

    [ForeignID] [int] NOT NULL,

    [AnotherID] [int] NOT NULL,

    [ScheduleCode] [int] NOT NULL,

    [StartDate] [datetime] NOT NULL,

    [EndDate] [datetime] NOT NULL,

    [PersonID] [int] NOT NULL,

    [ExampleDescription] [varchar](1000) NOT NULL,

    [CreatedDate] [datetime] NOT NULL,

    [CreatedByID] [int] NOT NULL,

    [ModifiedDate] [datetime] NOT NULL,

    [ModifiedByID] [int] NOT NULL,

    [ExampleFlag] [bit] NOT NULL,

    [OtherFlag] [bit] NOT NULL,

    [YetAnotherFlag] [bit] NOT NULL,

    [ExampleDate] [datetime] NULL,

    [LastDate] [datetime] NOT NULL,

    [FinalID] [int] NOT NULL,

    CONSTRAINT [PK_Example] PRIMARY KEY CLUSTERED

    (

    [ExampleID] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 98) ON [PRIMARY]

    ) ON [PRIMARY]

    There is nothing strange like updates to the identity happening and while some records are deleted, there has only been about 20,000 in the life of the table (months). Not enough to account for the level of fragmentation that we're seeing on the index.

    About the only thing I can think of that would cause fragmentation on this index in this scenario are:

    1. Page splits caused by starting with a small value in one of the VARCHARs and later inserting a larger value

    2. Page splits caused by the NULLABLE column, ExampleDate, starting with NULLs and later updating them to a date.

    For #1, I had development check the update scenarios for the varchar columns, especially the varchar(1000) one, and they didn't see it as a common thing where the values would go from small (or empty) to large.

    For #2, I checked and found that the only value for that column in the table is NULL so while it always starts as NULL, it never gets updated to anything else.

    I've tried looking at sys.dm_db_index_operational_stats and the leaf_update_count is around 300,000, but unless those updates are causing page splits, I don't see how they would contribute to fragmentation.

    What other scenarios would cause this fragmentation? What else can I look look at to get to the bottom of this?

    The only two things I can think of are what you already mentioned or if someone is doing a large batch insert and then going back to update the ExampleCode column (as you said, ExampleDescription is always left NULL). It could also be that someone is shrinking the database every night but I'd expect 99% fragmentation in that case.

    You can find out more. Create a trigger on the table to capture into a table who is doing what. I suppose you could temporarily use CDC for this, as well.

    And, yeah... I'd take the time to figure this out on this starting-to-get-large table. I'd also take the time to change the FILL FACTOR from 98 to 100 on this one. It's not going to help at 98 and you're just wasting 2% of the space and slowing everything down a bit by having to read dead space.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Maybe a simple question, but after the rebuild, is the fragmentation low or is it unchanged?

  • where I've seen this before is a sneaky maintenance plan where the DB is shrunk making redundant the lovely reindexing work carried out before the shrink.

    What could be worth doing is keeping an eye on the DMV sys.dm_db_index_physical_stats - add a step post re-index to dump out the table defrag % level to a table, and if possible run this against the tables in the DB throughout the day every 2 hours to identify what time things go bang, setting the mode to the most detailed you can get away with..

    -- edited to tidy up spellings

  • Steve Jones - SSC Editor (1/25/2015)


    Maybe a simple question, but after the rebuild, is the fragmentation low or is it unchanged?

    I had set up a job previously that captures the fragmentation every 15 minutes on the index and verified that the fragmentation is indeed going back down to 0 when the index maintenance runs and that the fragmentation is going up gradually throughout the day rather than jumping up all at once.

  • Also, I verified that there is no shrinking occurring on the database. Like Jeff said, if shrinking was happening, the fragmentation would go way up and it would happen all at once. I guess I'll try to get approval to put a trigger on the table to try and get more info to narrow down the culprit. Since we're running Standard edition CDC isn't an option.

  • Profiler will be better mechanism to capture the tsql that is causing DML on the table I think. See my earlier post.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • TheSQLGuru (1/26/2015)


    Profiler will be better mechanism to capture the tsql that is causing DML on the table I think. See my earlier post.

    Profiler or Extended Events? We're running SQL 2012 SP2 so I'm guessing EE is the way to go. That being said, I've never set up EE for DML on a specific table. Which events do you capture and how do you go about filtering to just DML on the table?

    Just go off of the text by using batch_text in sql_batch_completed? Or is there a better way?

  • Books Online is your friend. start with sp_trace_create, it links to the others. My profiler trace is text, cpu, duration, reads, writes, rowcount, starttime, databaseid, spid. That's all you need for tuning.

    ObjectID is probably a filter you want, but also use text filter for table name. I would search source code for table name too and look for those objects maybe.

    There are a number of important things you still can't do with XEs when it comes to tuning, with aggregate analysis being the most important one. Qure from DBSophic ROCKS!

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • We have been looking for key changes as a cause for index changes. What if there are no key changes at all? I'm betting that there are no key changes taking place. To prove that how about setting up a deny permission for UPDATE and DELETE on that column for every user on the database (even you). I think that you will not find any violations.

    So if we are not looking at key changes then what is going on. I suspect that simple data changes are the cause. When you expand data in a row that row might no fit back onto the original page. That row would have to be put into a new page. I think that the engine is moving several rows on the same page to a new page. That will force at least one key to be promoted up the tree and one key to be updated in place. Sooner or later with tight fill factors you will get promotions that split right up to the root page. It has to happen.

    You will remember that all indexes in SQL Server are B-tree. With a clustered index the data pages are the leaf level of that index. Well data changes that can't be done in place would have to cause leaf level page splits. It's just a matter of time before that kind of splitting works its affect higher up the tree.

    If all of your columns were fixed width then all of the updates could be done in place and would be right fast.

    ATBCharles Kincaid

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply