How do i identify resources needed to defrag/rebuild indexes

  • Im currently working for a company who have a 24x 7 system. it is literally that, zero downtime. They have been very very VERY lucky the past year or so, and have had no catastrophic failures.

    Their system is very very IO bound, and also bound severly by storage constraints. They are moving to a new architecture on much better servers wtih better servers but until then the ship still needs to stay afloat.

    The current issue is performance. The index maintenance has been disabled as each time it ran it caused transactions to be blocked, causing customers to be timed out of the application running over the database. This is not allowed so, in a fit of wisdom they disabled the index maintenance.

    Now queries are starting to run slowly, also causing timeouts on large reports. big supprise i know. I want to start maintenance of the indexes, but there are 50+ databases on the server, each with many tables, each with many indexes.

    I am unsure how to approach the problem. Manually fixing each index one by one is out of the question.

    Running all at once seems to be out of the question also. So I need help working out a few things:

    1. Best method/process to allow index maintenance to proceed, while consuming minimal storage space (tempdb) and not noticably cause contention on any tables.

    2. Knowledge of whtat stats i need to gather in order to identify the resources required to perform the process defined in step1.

    Big ask's, i know, but if anyone can point me in a right direction i would appreciate it.

    Thanks!

  • 1. Maintain only usable indexes. You can check which indexes are been used running dmv queries (sys.dm_db_index_operational_stats and sys.dm_db_index_usage_stats)

    2. Be sure you don't have significative scan tables: Check profiler and missing indexes (sys.dm_db_missing_index_details,sys.dm_db_missing_index_group_stats, sys.dm_db_missing_index_groups, sys.dm_db_missing_index_columns). A good indexed database will reduce blocks during reindex processes.

    3. Check tempdb space and filegroup distribution.

    4. Reindex only fragmented indexes. This will drastically reduce time, blocks & tempdb use (http://www.sqlservercentral.com/scripts/Indexing/31475/)

    5. Monitor index fragmetation rate and adjust fill factor strategies.

    6. Use INDEXDEFRAG instead of dbreindex

    Good luck

    Carlos

  • great ideas there to start creation of a plan , thanks carlos.

    I would like to know if anyone has expiremented with the issues caused by online index rebuilding and reorganizing. Specifically:

    1. in a reorg ( which is always online), is the index unusable while its being reorganized, i.e potential for many more table scans ?

    2. in an online rebuild, will this consume as much space as if the index was being created for the first time? if so, although a reorg may take longer, if i am constrained for space in tempdb or the current database, wouldnt i be safer doing a reorg, even if it takes longer?

    3. how can i calculate the storage space required in tempdb, or the current db for rebuilding/creating a particular index?

  • You may want to read these blog posts by Jason Strate (A great deal of T-SQL is included to assist you).

    [http://www.sqlservercentral.com/blogs/stratesql/archive/2010/10/13/return-of-index-analysis-part-3.aspx/url]

    Then select the most used indexes, and from those the most badly fragmentated, in each DB and with that information ... set up a job to perform the index rebuild/reorganize for a few indexes during each run of the job... a lot of work, but as the situation improves you will have less timeouts from clients and who knows you might be able to use the time saved to update even more indexes in each job run.

    If everything seems to be going well, you have obviously overlooked something.

    Ron

    Please help us, help you -before posting a question please read[/url]
    Before posting a performance problem please read[/url]

  • winston Smith (7/31/2011)


    great ideas there to start creation of a plan , thanks carlos.

    I would like to know if anyone has expiremented with the issues caused by online index rebuilding and reorganizing. Specifically:

    1. in a reorg ( which is always online), is the index unusable while its being reorganized, i.e potential for many more table scans ?

    2. in an online rebuild, will this consume as much space as if the index was being created for the first time? if so, although a reorg may take longer, if i am constrained for space in tempdb or the current database, wouldnt i be safer doing a reorg, even if it takes longer?

    3. how can i calculate the storage space required in tempdb, or the current db for rebuilding/creating a particular index?

    Interesting "Catch 22" we have going on here. Reorgs do just that... they reorganze the index but no space is returned to the system. Clustered Indexes can suffer page splits and Non-Clustered Indexes can suffer extent splits, as well. Depending on the number of rows that are "sorted" into whatever type of split you have, you could be wasting nearly 50% of the space on the disk just in split indexes alone.

    My (very strong) recommendation would to lookup "sys.dm_db_index_physical_stats dynamic management function" in Books Online. You'll also find a handy script in the examples that you can easily tweek to do what you want.

    My recommendation would also be to concentrate on the longest running queries. Find out which indexes they're using and see what their frag levels are. Follow your nose on what to do after that.

    I'll caution against using any SQL Server suggestion on missing indexes or statistics. It has been known to recommend low cardinality indexes for some SELECTs which are absolutely the worst thing you can have for INSERTS and can be a leading cause of TimeOuts on the GUI side of the house. On the other hand, adding a missing stat or index is sometimes all it takes.

    On those same long running queries, you need to look for things like the SARGablity of the WHERE clauses and ON clauses and in the execution plans for arrows that are fatter than the tables they work with which can be a sign of insufficient criteria leading to hidden/accidental many-to-many joins.

    There are a thousand things to look for in the code of long running batches. Sometimes a simple tweek or properly written index will solve the problem. Other times, it takes a little "Divide'n'Conquer' and other times it'll take a major rewrite Again... follow your nose on the biggest bang for the buck in the least amount of time.

    When you think you're all done, start over and do it again. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • thanks guys. lots of good areas to examine. you have given me the basis of a plan of action.

    Id still like to know if there is a way of identifying the storage requirements to rebuild an index in tempdb or in the current database. anyway to work this out?

  • Should be roughly 1.5X for tempdb.

    The thing is that you'll do this 1 index at a time, so the real idea is to find the biggest of them all and account only for that one.

    I didn't save the link but this came from Jason IIRC. That gives you VERY detailed info about tables and indexes but it takes a heck of a lot of power to run. Do that on a test server...

    Edit : http://jasonbrimhall.info/2010/05/19/tablespace-update/

    SET IMPLICIT_TRANSACTIONS OFF

    SET NOCOUNT ON

    /* Part I */

    --Drop Table #indstats

    IF exists (SELECT * FROM tempdb.sys.objects WHERE name like '%#indstats%')

    BEGIN

    DROP TABLE tempdb.dbo.#indstats

    END

    BEGIN

    CREATE TABLE #indstats (

    IndStatsID INT PRIMARY KEY CLUSTERED

    ,database_id BIGINT

    ,index_id BIGINT

    ,IndexSizeMB DECIMAL(16,1)

    ,OBJECT_IDBIGINT

    );

    END

    INSERT INTO #indstats (IndStatsID,database_id,index_id,OBJECT_ID,IndexSizeMB)

    SELECT Row_Number() OVER (ORDER BY OBJECT_ID) AS IndStatsID

    ,database_id,index_id,OBJECT_ID

    ,CONVERT(DECIMAL(19,2),(SUM(ps.page_count))) * 8 /1024 AS IndexSizeMB

    FROM sys.dm_db_index_physical_stats(DB_ID(),null,NULL,NULL,'DETAILED') ps

    GROUP BY database_id,OBJECT_ID,index_id;

    /* Part II */

    DECLARE @dbsize DECIMAL(19,2)

    SET NOCOUNT ON

    /*

    ** Summary data.

    */

    BEGIN

    SELECT @dbsize = SUM(CONVERT(DECIMAL(19,2),CASE WHEN type = 0 THEN SIZE ELSE 0 END)) * 8/1024

    FROM sys.database_files

    END

    /* Part III */

    BEGIN

    WITH RegData AS (

    SELECT a.container_id,p.OBJECT_ID,p.index_id,us.database_id

    ,FileGroupName = FILEGROUP_NAME(a.data_space_id)

    ,TableName = OBJECT_NAME(p.OBJECT_ID)

    ,NumRows = p.ROWS

    ,UsedPages = IsNull(a.used_pages,0)

    ,TotalPages = IsNull(a.total_pages,0)

    ,DataSizeMB = CONVERT(DECIMAL(19,2),IsNull(a.used_pages,0)) * 8/1024

    ,IndexSizeMB = CASE WHEN ps.index_id < 2 THEN 0 ELSE ps.IndexSizeMB END

    ,UserRequests = IsNull(us.user_seeks,0) + IsNull(us.user_scans,0) + IsNull(us.user_lookups,0)

    ,UserUpdates = IsNull(us.user_updates,0)

    ,LastUpdate = IsNull(us.last_user_update,null)

    ,RatioRequestsToUpdates = CAST(IsNull(us.user_seeks,0)

    + IsNull(us.user_scans,0) + IsNull(us.user_lookups,0) AS REAL)

    / CAST(CASE us.user_updates WHEN 0 THEN 1 ELSE us.user_updates END AS REAL)

    FROM sys.allocation_units a

    INNER Join sys.partitions p

    ON p.hobt_id = a.container_id

    And a.type = 1

    LEFT Outer Join sys.dm_db_index_usage_stats us

    ON us.OBJECT_ID = p.OBJECT_ID

    And us.index_id = p.index_id

    And us.database_id = DB_ID()

    LEFT Outer Join #indstats ps

    ON p.index_id = ps.index_id

    And ps.database_id = DB_ID()

    And p.OBJECT_ID = ps.OBJECT_ID

    --WHERE OBJECTPROPERTY(p.OBJECT_ID,'IsMSShipped') = 0

    )

    , LOBData AS (

    SELECT a.container_id,p.OBJECT_ID,p.index_id,us.database_id

    ,FileGroupName = FILEGROUP_NAME(a.data_space_id)

    ,TableName = OBJECT_NAME(p.OBJECT_ID)

    ,NumRows = p.ROWS

    ,UsedPages = IsNull(a.used_pages,0)

    ,TotalPages = IsNull(a.total_pages,0)

    ,DataSizeMB = CONVERT(DECIMAL(19,2),IsNull(a.used_pages,0)) * 8/1024

    ,IndexSizeMB = CASE WHEN ps.index_id < 2 THEN 0 ELSE ps.IndexSizeMB END

    ,UserRequests = IsNull(us.user_seeks,0) + IsNull(us.user_scans,0) + IsNull(us.user_lookups,0)

    ,UserUpdates = IsNull(us.user_updates,0)

    ,LastUpdate = IsNull(us.last_user_update,null)

    ,RatioRequestsToUpdates = CAST(IsNull(us.user_seeks,0)

    + IsNull(us.user_scans,0) + IsNull(us.user_lookups,0) AS REAL)

    / CAST(CASE us.user_updates WHEN 0 THEN 1 ELSE us.user_updates END AS REAL)

    FROM sys.allocation_units a

    INNER Join sys.partitions p

    ON p.partition_id = a.container_id

    And a.type = 2

    LEFT Outer Join sys.dm_db_index_usage_stats us

    ON us.OBJECT_ID = p.OBJECT_ID

    And us.index_id = p.index_id

    And us.database_id = DB_ID()

    LEFT Outer Join #indstats ps

    ON p.index_id = ps.index_id

    And ps.database_id = DB_ID()

    And p.OBJECT_ID = ps.OBJECT_ID

    --WHERE OBJECTPROPERTY(p.OBJECT_ID,'IsMSShipped') = 0

    )

    , OverFlowData AS (

    SELECT a.container_id,p.OBJECT_ID,p.index_id,us.database_id

    ,FileGroupName = FILEGROUP_NAME(a.data_space_id)

    ,TableName = OBJECT_NAME(p.OBJECT_ID)

    ,NumRows = p.ROWS

    ,UsedPages = IsNull(a.used_pages,0)

    ,TotalPages = IsNull(a.total_pages,0)

    ,DataSizeMB = CONVERT(DECIMAL(19,2),IsNull(a.used_pages,0)) * 8/1024

    ,IndexSizeMB = CASE WHEN ps.index_id < 2 THEN 0 ELSE ps.IndexSizeMB END

    ,UserRequests = IsNull(us.user_seeks,0) + IsNull(us.user_scans,0) + IsNull(us.user_lookups,0)

    ,UserUpdates = IsNull(us.user_updates,0)

    ,LastUpdate = IsNull(us.last_user_update,null)

    ,RatioRequestsToUpdates = CAST(IsNull(us.user_seeks,0)

    + IsNull(us.user_scans,0) + IsNull(us.user_lookups,0) AS REAL)

    / CAST(CASE us.user_updates WHEN 0 THEN 1 ELSE us.user_updates END AS REAL)

    FROM sys.allocation_units a

    INNER Join sys.partitions p

    ON p.hobt_id = a.container_id

    And a.type = 3

    LEFT Outer Join sys.dm_db_index_usage_stats us

    ON us.OBJECT_ID = p.OBJECT_ID

    And us.index_id = p.index_id

    And us.database_id = DB_ID()

    LEFT Outer Join #indstats ps

    ON p.index_id = ps.index_id

    And ps.database_id = DB_ID()

    And p.OBJECT_ID = ps.OBJECT_ID

    --WHERE OBJECTPROPERTY(p.OBJECT_ID,'IsMSShipped') = 0

    ), IndexSum AS (

    SELECT a.OBJECT_ID

    ,AllDataSizeMB = SUM(CASE WHEN a.index_id in (0,1)

    THEN IsNull(a.DataSizeMB,0) + IsNull(p2.DataSizeMB,0) + IsNull(p3.DataSizeMB,0)

    ELSE IsNull(p2.DataSizeMB,0) + IsNull(p3.DataSizeMB,0)

    END)

    FROM RegData a

    LEFT Outer Join LOBData p2

    ON p2.container_id = a.container_id

    LEFT Outer Join OverFlowData p3

    ON p3.container_id = a.container_id

    GROUP BY a.OBJECT_ID

    ), SummaryInfo AS (

    SELECT

    TableName = MAX(a.TableName)

    ,InRowDataSizeMB = SUM(IsNull(a.DataSizeMB,0))

    ,LOBDataSizeMB = SUM(IsNull(p2.DataSizeMB,0))

    ,OFlowDataSizeMB = SUM(IsNull(p3.DataSizeMB,0))

    ,NumRows = MAX(COALESCE(a.NumRows,p2.NumRows,p3.NumRows,0))

    ,AllUsedPages = SUM(IsNull(a.UsedPages,0) + IsNull(p2.UsedPages,0) + IsNull(p3.UsedPages,0))

    ,AllPages = SUM(IsNull(a.TotalPages,0) + IsNull(p2.TotalPages,0) + IsNull(p3.TotalPages,0))

    ,FreeDataSpaceMB = CONVERT(DECIMAL(19,2),

    SUM(IsNull(a.TotalPages,0) + IsNull(p2.TotalPages,0) + IsNull(p3.TotalPages,0))

    - SUM(IsNull(a.UsedPages,0) + IsNull(p2.UsedPages,0) + IsNull(p3.UsedPages,0)))* 8 / 1024

    ,AllDataSizeMB = MAX(ids.AllDataSizeMB)

    ,IndexSizeMB = SUM(IsNull(a.IndexSizeMB,0))

    + SUM(IsNull(p2.IndexSizeMB,0)) + SUM(IsNull(p3.IndexSizeMB,0))

    ,UserRequests_Cnt = AVG(IsNull(a.UserRequests,0)

    + IsNull(p2.UserRequests,0) + IsNull(p3.UserRequests,0))

    ,UserUpdates_Cnt = AVG(IsNull(a.UserUpdates,0) + IsNull(p2.UserUpdates,0) + IsNull(p3.UserUpdates,0))

    ,LastUpdate = MAX(COALESCE(a.LastUpdate,p2.LastUpdate,p3.LastUpdate,null))

    ,DatabaseSize = @dbsize

    FROM RegData a

    LEFT Outer Join LOBData p2

    ON p2.container_id = a.container_id

    LEFT Outer Join OverFlowData p3

    ON p3.container_id = a.container_id

    LEFT Outer Join sys.indexes i

    ON i.OBJECT_ID = a.OBJECT_ID

    And i.index_id = a.index_id

    LEFT Outer Join IndexSum ids

    ON i.OBJECT_ID = ids.OBJECT_ID

    --WHERE filegroup_name(a.data_space_id) = 'Primary'

    GROUP BY a.OBJECT_ID

    )

    SELECT TableName,NumRows,InRowDataSizeMB,LOBDataSizeMB,OFlowDataSizeMB

    ,AllUsedPages,AllPages

    ,FreeDataSpaceMB,AllDataSizeMB,IndexSizeMB

    ,TableSizeMB = AllDataSizeMB + IndexSizeMB + FreeDataSpaceMB

    ,UserRequests_Cnt,UserUpdates_Cnt,LastUpdate

    ,PercentofDB = ((IndexSizeMB + AllDataSizeMB) / DatabaseSize) * 100

    ,DatabaseSize

    FROM SummaryInfo

    ORDER BY PercentofDB DESC, NumRows DESC

    END

  • wow! Thanks for that. There's a lot going on in that query. il give it a good going over tomorrow in work. appreciate it. Thanks all!

  • winston Smith (8/1/2011)


    wow! Thanks for that. There's a lot going on in that query. il give it a good going over tomorrow in work. appreciate it. Thanks all!

    Thank Jason, I only saved a bookmark ;-).

  • FYI - From my MCDBA training days, the recommendation was 1.25 * size of largest index. For Clustered indexes, that would be 1.25 * size of the table. Therefore, find the biggest table and multiply by 1.25 for size estimation needed in TempDB.

    The previous poster is correct, you should run these one at a time. I mix mine up, and do a group on Monday night, a group on Tuesday night, etc., and then varying the time for each, say at least 30 minutes apart for the ones being run on any night. Definitely pick the times to be when the server is least busy.

Viewing 10 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply