Index ReOrg & fragmentation

  • We are looking to fine-tune our ETL process for our Data Warehouse and some of the articles we have found deal with index reorg and improvements this can make. (http://technet.microsoft.com/en-us/library/ms189858.aspx) They have stated that if the fragmentation is > 30% rebuild the index but if between 5-30% reorg the index.

    With this in mind we were thinking of checking the fragmentation and then have a decision point to determine whether we would reorg or rebuild. We build a quick little test to prove that this would work below:

    --CREATE TABLE

    CREATE TABLE DECIPHER_DATA (COL1 INT, COL2 NVARCHAR(500));

    --INSERT 500 ROWS

    SET NOCOUNT ON

    GO

    DECLARE @I INT

    SET @I = 1

    WHILE (@I <= 500)

    BEGIN

    INSERT INTO dbo.DECIPHER_DATA (COL1, COL2) VALUES (@I, REPLICATE('A', @I))

    SET @I = @I + 1

    END

    --CREATE INDEX

    CREATE UNIQUE INDEX PK_DECIPHER_DATA ON DECIPHER_DATA (COL1);

    UPDATE STATISTICS DECIPHER_DATA WITH FULLSCAN, ALL;

    --DELETE 250 ROWS

    DELETE FROM DECIPHER_DATA WHERE COL1 <= 250;

    UPDATE STATISTICS DECIPHER_DATA WITH FULLSCAN, ALL;

    --Find Fragmentation

    SELECT a.index_id, name, avg_fragmentation_in_percent

    FROM sys.dm_db_index_physical_stats (DB_ID(), OBJECT_ID(N'DECIPHER_DATA'),

    NULL, NULL, NULL) AS a

    JOIN sys.indexes AS b ON a.object_id = b.object_id AND a.index_id = b.index_id;

    --ReOrg

    ALTER INDEX [PK_DECIPHER_DATA] ON [dbo].[DECIPHER_DATA] REORGANIZE WITH ( LOB_COMPACTION = ON )

    --Find Fragmentation

    SELECT a.index_id, name, avg_fragmentation_in_percent

    FROM sys.dm_db_index_physical_stats (DB_ID(), OBJECT_ID(N'DECIPHER_DATA'),

    NULL, NULL, NULL) AS a

    JOIN sys.indexes AS b ON a.object_id = b.object_id AND a.index_id = b.index_id;

    --INSERT 250 again

    SET NOCOUNT ON

    GO

    DECLARE @I INT

    SET @I = 1

    WHILE (@I <= 250)

    BEGIN

    INSERT INTO dbo.DECIPHER_DATA (COL1, COL2) VALUES (@I, REPLICATE('A', @I))

    SET @I = @I + 1

    END

    --UPDATE Statistics

    UPDATE STATISTICS DECIPHER_DATA WITH FULLSCAN, ALL;

    --Rerun fragmentation

    SELECT a.index_id, name, avg_fragmentation_in_percent

    FROM sys.dm_db_index_physical_stats (DB_ID(), OBJECT_ID(N'DECIPHER_DATA'),

    NULL, NULL, NULL) AS a

    JOIN sys.indexes AS b ON a.object_id = b.object_id AND a.index_id = b.index_id;

    --ReOrg

    ALTER INDEX [PK_DECIPHER_DATA] ON [dbo].[DECIPHER_DATA] REORGANIZE WITH ( LOB_COMPACTION = ON )

    --Rerun Fragmentation

    SELECT a.index_id, name, avg_fragmentation_in_percent

    FROM sys.dm_db_index_physical_stats (DB_ID(), OBJECT_ID(N'DECIPHER_DATA'),

    NULL, NULL, NULL) AS a

    JOIN sys.indexes AS b ON a.object_id = b.object_id AND a.index_id = b.index_id;

    However, in running this test which builds a table, inserts 500 rows, deletes 250 rows, and then reinserts 250 rows again we have found that neither reorg or rebuild actually removes the fragmentation.

    Is there a better way to determine fragmentation? Or are we going down the wrong path altogether? Any thoughts would be appreciated. Thanks.

  • Hi,

    if I were you, rather than spend time reinventing the wheel generating and testing a script to do this, I would just use the following excellent script from Michelle Ufford.

    http://sqlfool.com/2009/06/index-defrag-script-v30/

  • Thanks, I'll take a look at it. In addition can you help me understand how fragmentation is calculated? I would assume when a reorg or rebuild happens that the fragmentation would be decreased or removed altogether, but in the very simple script we ran, the fragmentation stayed same following the reorg.

  • aber (4/23/2010)


    Thanks, I'll take a look at it. In addition can you help me understand how fragmentation is calculated? I would assume when a reorg or rebuild happens that the fragmentation would be decreased or removed altogether, but in the very simple script we ran, the fragmentation stayed same following the reorg.

    This is due to the size of the table and the way SQL Server stores small tables. Small tables are stored in shared extents - and will not ever be fully defragmented.

    The general rule is to not worry about tables with less than 1000 pages - which you'll see when you review Michelle's utility.

    With that said, I am not sure how doing this is going to improve (fine-tune) your ETL process. It is not a bad idea to rebuild/reorganize your indexes - but in an ETL process it won't help. What really helps the ETL process is making sure you insert the data in the same order as the clustered index, and, depending on how much data is being inserted/updated - disabling the non-clustered indexes.

    For example, if you are reloading the table - disable all non-clustered indexes, insert in the same order as your clustered index and then rebuild all indexes (not reorganize, you can't when you disable them).

    On the other hand, if you are performing incremental updates - and the percentage of rows is very small compared to the number of rows in the table, disabling the non-clustered indexes and rebuilding might take longer than leaving them alone. Testing in your environment will tell you which is best.

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • This was the exact information that I was looking for, thanks! We have played around a bit with disabling and then rebuilding and did find that some of these larger tables were taking longer to rebuild than it was worth.

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply