Why would an identity clustered index get fragmented?

  • Hi,

    Lets say I have a table called company. In that table I have: CompanyID, CompanyName, CompanyURL, etc. columns.

    CompanyID is an Identity column which has PK and clustered index on it. If I insert lets say 300k records into the table, the clustered index should not get fragmented, because it has auto-incrementing values which are sequential. But it does.

    Any ideas what could be the reason?

    Thanks.

  • A few possibilities:-

    Deletes leaving a gap where the row was.

    Updates that change the size of the row, forcing the page to split

    Inserts into the table that rollback before being completed.

    Part of the table has been allocated in a mixed extent.

  • Are you shrinking the DB? Maintenance plan or auto_shrink? That'll fragment anything.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • I have also faced similar problem and for me there is no maintenance plan / shrink db done.

    the DB was in SQL2K5 express edition and the recovery model is Simple. I used multiple perlscript to load 1M records to a table throughout the day and finally seen frangmentation about 70% for the clustered index on identity column.

  • Is it the fill factor of the index?

    Cheers,
    Prithiviraj Kulasingham

    http://preethiviraj.blogspot.com/

  • GilaMonster (11/12/2009)


    Are you shrinking the DB? Maintenance plan or auto_shrink? That'll fragment anything.

    No, not at all.

  • Do you have BLOB data in the table?

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • Ian Scarlett (11/12/2009)


    A few possibilities:-

    Deletes leaving a gap where the row was.

    Updates that change the size of the row, forcing the page to split

    Inserts into the table that rollback before being completed.

    Part of the table has been allocated in a mixed extent.

    Ok, the first one should not cause fragmentation, it would only make index pages more "empty"

    the second one, I am not sure 100%, but the clustered index only has that identity column which does not get updated.

    Would the data updates affect clustered index pages?

    the third one is quite plausible. will need to check it.

    as for the fourth, the size of the row is under 1000 bytes, so this should not happen.

  • CirquedeSQLeil (11/12/2009)


    Do you have BLOB data in the table?

    No, this is the script for the table:

    CREATE TABLE [dbo].[Company](

    [CompanyID] [int] IDENTITY(1,1) NOT NULL,

    [Col2] [nvarchar](100) NOT NULL,

    [Col3] [nvarchar](100) NOT NULL,

    [Col4] [nvarchar](150) NULL,

    [Col5] [int] NULL,

    [Col6] [datetime] NULL,

    [Col7] [uniqueidentifier] NULL,

    [Col8] [nvarchar](15) NULL,

    [Col9] [int] NULL,

    [Col10] [nvarchar](50) NULL,

    [Col11] [tinyint] NOT NULL,

    [Col12] [datetime] NOT NULL,

    CONSTRAINT [PK_Company] PRIMARY KEY CLUSTERED

    (

    [CompanyID] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 95) ON [PRIMARY]

    ) ON [PRIMARY]

    GO

  • There's no guarantee that you're inserting into contiguous extents or pages.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (11/12/2009)


    There's no guarantee that you're inserting into contiguous extents or pages.

    If I am NOT using "set identity insert on", why this would happen?

  • Roust_m (11/13/2009)


    Jeff Moden (11/12/2009)


    There's no guarantee that you're inserting into contiguous extents or pages.

    If I am NOT using "set identity insert on", why this would happen?

    It could be filling in blank extents between other tables. There is no guarantee that each extent in a table will be next to the other. There are other things on the harddisk. How fragged is the actual harddisk? Also, did the database grow by percent over time?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Think of it this way.... you may have tables occupying extents like the following (each letter is a table)...

    ABADADAAFFAJAKABAAADAB

    If the table you're looking at is "A", then there's some fragmentation even if all the parts are in perfect order.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Hmmm... maybe you're not talking about that type of fragmentation. Even if the "A"s were scattered apart like that, if they were all in perfect order, the logical fragmentation would be close to or at 0.

    I've not looked back in this post to see if you've done so, but if you've modified data in VARCHAR columns to be larger than the original, you could end up with page splits (default fill factor for SQL Server is "0" which is very similar to "100").

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Roust_m (11/12/2009)


    the second one, I am not sure 100%, but the clustered index only has that identity column which does not get updated.

    Would the data updates affect clustered index pages?

    Absolutely. The clustered index has the entire data row at the leaf levels. At the non-leaf levels it has only the identity column. If an update to any column grows the row and there's not enough space on the page, the page will split causing fragmentation.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 15 posts - 1 through 15 (of 19 total)

You must be logged in to reply to this topic. Login to reply