Fragmentation on partitioned tables

  • Hi,

    I’m looking for some tips on fragmentation.

    A couple of quick facts about the database -

    29 tables on same partition scheme

    424,330,993 total records and counting

    Total database size is about 500 GB and could triple over the next several months.

    Most indexes are clustered.

    This is not an OLTP database in the traditional sense. It is used for processing batches of records so bulk inserts, updates and deletes happen frequently across ranges of data. The avg_fragmentation_in_percent for some of our partitions is over 90. My question is what is the best approach to get this unfragmented (within SQL Server for now) of the 2 methods below. Or is there something better?

    One of my concerns is that I’m not sure that reorganizing updates statistics. Another is just the time it’s going to take to rebuild these indexes. If I have to kill the process, will it have to rollback and what kind of risks are there if I do have to kill it?

    Method 1 –

    In Books Online under Alter Index, it states:

    “If ALL is specified with PARTITION = partition_number, all indexes must be aligned. This means that they are partitioned based on equivalent partition functions. Using ALL with PARTITION causes all index partitions with the same partition_number to be rebuilt or reorganized. For more information about partitioned indexes, see Partitioned Tables and Indexes.”

    Please give an example on the syntax for this. I was thinking something like this for each table but it seems, in my small mind, thinking about aligning indexes on one partition scheme that running this on separate days for different tables would be defeating the purpose.

    alter index all on tbMyTable

    rebuild with (fillfactor = 80, sort_in_tempdb = on,

    statistics_norecompute = on);

    Method 2 -

    -- Script from BOL

    SET NOCOUNT ON;

    DECLARE @objectid int;

    DECLARE @indexid int;

    DECLARE @partitioncount bigint;

    DECLARE @schemaname nvarchar(130);

    DECLARE @objectname nvarchar(130);

    DECLARE @indexname nvarchar(130);

    DECLARE @partitionnum bigint;

    DECLARE @partitions bigint;

    DECLARE @frag float;

    DECLARE @command nvarchar(4000);

    -- Conditionally select tables and indexes from the sys.dm_db_index_physical_stats function

    -- and convert object and index IDs to names.

    SELECT

    object_id AS objectid,

    index_id AS indexid,

    partition_number AS partitionnum,

    avg_fragmentation_in_percent AS frag

    INTO #work_to_do

    FROM sys.dm_db_index_physical_stats (DB_ID('MyDB'), NULL, NULL , NULL, 'LIMITED')

    WHERE avg_fragmentation_in_percent > 10.0 AND index_id > 0;

    -- Declare the cursor for the list of partitions to be processed.

    DECLARE partitions CURSOR FOR SELECT * FROM #work_to_do;

    -- Open the cursor.

    OPEN partitions;

    -- Loop through the partitions.

    WHILE (1=1)

    BEGIN;

    FETCH NEXT

    FROM partitions

    INTO @objectid, @indexid, @partitionnum, @frag;

    IF @@FETCH_STATUS < 0 BREAK;

    SELECT @objectname = QUOTENAME(o.name), @schemaname = QUOTENAME(s.name)

    FROM sys.objects AS o

    JOIN sys.schemas as s ON s.schema_id = o.schema_id

    WHERE o.object_id = @objectid;

    SELECT @indexname = QUOTENAME(name)

    FROM sys.indexes

    WHERE object_id = @objectid AND index_id = @indexid;

    SELECT @partitioncount = count (*)

    FROM sys.partitions

    WHERE object_id = @objectid AND index_id = @indexid;

    -- 30 is an arbitrary decision point at which to switch between reorganizing and rebuilding.

    IF @frag < 30.0

    SET @command = N'ALTER INDEX ' + @indexname + N' ON ' + @schemaname + N'.' + @objectname + N' REORGANIZE';

    IF @frag >= 30.0

    SET @command = N'ALTER INDEX ' + @indexname + N' ON ' + @schemaname + N'.' + @objectname + N' REBUILD';

    IF @partitioncount > 1

    SET @command = @command + N' PARTITION=' + CAST(@partitionnum AS nvarchar(10));

    --EXEC (@command);

    PRINT @command;

    END;

    -- Close and deallocate the cursor.

    CLOSE partitions;

    DEALLOCATE partitions;

    -- Drop the temporary table.

    DROP TABLE #work_to_do;

    GO

    -- Returns (small sample)

    ALTER INDEX [idx_tbMyTable_2] ON [dbo].[tbMyTable] REORGANIZE PARTITION=4

    ALTER INDEX [idx_tbMyTable_2] ON [dbo].[tbMyTable] REBUILD PARTITION=5

    ALTER INDEX [idx_tbMyTable_2] ON [dbo].[tbMyTable] REBUILD PARTITION=6

    ALTER INDEX [idx_tbMyTable_2] ON [dbo].[tbMyTable] REBUILD PARTITION=7

    ALTER INDEX [idx_tbMyTable_2] ON [dbo].[tbMyTable] REBUILD PARTITION=9

    ALTER INDEX [idx_tbMyTable_2] ON [dbo].[tbMyTable] REORGANIZE PARTITION=10

    ALTER INDEX [idx_tbMyTable_2] ON [dbo].[tbMyTable] REORGANIZE PARTITION=11

    ALTER INDEX [idx_tbMyTable_2] ON [dbo].[tbMyTable] REORGANIZE PARTITION=13

    Thanks for your time and any feedback.

  • Here are some quick tips for you:

    - reorganizing an index will never update stats - the operation doesn't have a complete view of the index so can't update stats. I tried to figure out a way to do this when I wrote the original DBCC INDEXDEFRAG but couldn't.

    - when you rebuild an index, why would you not want it to update stats at the same time?

    - rebuilding a single partition cannot be done online (but rebuilding an index can), reorganizing a partition/index is always online

    I'd go with your option of per-partition fragmentation removal.

    I'd also look at your schema and access patterns to see if there's something inherent in them that would cause fragmentation (e.g. GUID as a primary key, updating variable-length columns to be longer, etc). One last thing - before removing fragmentation, make sure that the queries that use these indexes will benefit from removing fragmentation.

    Hope this helps

    Paul Randal
    CEO, SQLskills.com: Check out SQLskills online training!
    Blog:www.SQLskills.com/blogs/paul Twitter: @PaulRandal
    SQL MVP, Microsoft RD, Contributing Editor of TechNet Magazine
    Author of DBCC CHECKDB/repair (and other Storage Engine) code of SQL Server 2005

  • Thanks, Paul. That helps quite a bit.

    I always assumed just inserting and deleting so many records as frequently as we do would cause fragmentation. There are several columns in the varchar(100 – 5000) range most of which don’t get updated but I’ll be taking a closer look. Is it better to have a fixed column length as far as fragmentation is concerned and how would that affect performance?

    Is it good practice then to update statistics after reorganizing an index?

    Thanks again.

  • oh - I didn't mean those were the only things that could cause fragmentation. Insert and deletes will cause fragmentation too, depending on their pattern and position within the index.

    As far as variable length columns are concerned - the problem comes when an index is created with a 0 fillfactor so the rows get packed together and there's no space on the page. If a variable length value is then updated to be longer than it was when the row was created, it's likely tocause a page-split (and hence fragmentation). This doesn't mean variable length columns are bad - it oculd just be your pattern of updates that causes fragmentation. Moving to a fixed-length column could have space wastage implications. My favorite answer applies here - it depends! 🙂

    Yes, updating stats after a reorganize can be a good thing.

    Paul Randal
    CEO, SQLskills.com: Check out SQLskills online training!
    Blog:www.SQLskills.com/blogs/paul Twitter: @PaulRandal
    SQL MVP, Microsoft RD, Contributing Editor of TechNet Magazine
    Author of DBCC CHECKDB/repair (and other Storage Engine) code of SQL Server 2005

  • Thanks Paul - it's great having you on the forums!

    I appreciate your help.

  • In case you would go for the index-level option I have a stored procedure that might be helpful. (It could be that you have to keep the tables online and therefore cannot do partition-level rebuild.)

    The stored procedure does index rebuild online or offline, index reorganization, statistics update, index reorganization and statistics update or nothing based on fragmentation level, LOB existence and index size.

    http://ola.hallengren.com/sql-server-index-and-statistics-maintenance.html

    Ola Hallengren

    http://ola.hallengren.com

  • Hey Paul,

    Really is great to have the guy who wrote the code answering questions. My partition is based on a datetime datatype. Partition works like a charm but was heavily fragmented (85-97%) range. I was able to run some code to rebuild the not clustered indexes but the clustered indexes do seem to be getting rebuilt - after running the script for 24 hours. Since rebuilding an index on a partitioned table online is feasible, what are some of the gotchas of rebuilding the clustered index of that the partition is on? Thanks a million. I really appreciate the back and forth.

  • If you can insert records in the order of the clustered index, you won't have as much fragmentation because the inserts are appended onto the end of the index instead of being rammed into an existing page and causing a split.

    This method is not the solution to every problem because you might need to create covering indexes which will take more space in order to perform well when retrieving the data.

    Sometimes people don't think about load order, which is why I bring it up.

    Thanks

  • Good point. In this case we're partitioning by the date because we want to archive old data that isn't necessarily organized. It's an aggreggation from 8 DBs where the they each have data from various sections of the partition. After my data is loaded and I rebuild, the clustered index on a partition column, what is an "acceptable" amount of fragmentation on a table partitioned by a datetime column?

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply