July 14, 2013 at 12:02 pm
I have several databases in full recovery model, around 200 GB each. As soon as I run the maintenance plan, the log fills up pretty quickly. Is it better to change the recovery model to simple while performing maintenance, and then change back to full afterwards? Would that keep the log from filling up so quickly?
Also, it doesn't seem like we need to update all the statistics or rebuild all the indexes, as maintaining all of them seems resource consuming. Is there a DMV we can use to show which indexes/statistics are most used, so we can do maintenance on just those?
I am just worried about the log getting full and causing another problem.
I am on 2005.
This is kind of urgent, as some queries recently started running longer because we had outdated statistics.
Please advise.
July 14, 2013 at 12:46 pm
Bulk-logged is the best recovery model for controlling log growth during index rebuilds; take a full backup afterwards. Note that log backups will still be large, though. You could also try backing up the log more frequently during reindexing.
You are correct to try to rebuild only the indexes that need it; the DMV for that is sys.dm_db_index_physical_stats. This will also reduce log activity. There are many scripts for that: the one from BOL works just fine, and the crowd favourite is Ola Hallengren's suite of maintenance utilities.
For those indexes which are only reorganised you will still need to update the stats.
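As a rough sketch of that approach (the 5% / 30% thresholds are the commonly cited BOL guidelines, not hard rules, and the page_count filter is an arbitrary cut-off for tiny indexes):

```sql
-- Sketch: suggest REORGANIZE vs REBUILD based on fragmentation.
SELECT OBJECT_NAME(ps.[object_id]) AS TableName,
       i.name AS IndexName,
       ps.avg_fragmentation_in_percent,
       CASE
           WHEN ps.avg_fragmentation_in_percent >= 30 THEN 'REBUILD'
           WHEN ps.avg_fragmentation_in_percent >= 5  THEN 'REORGANIZE'
           ELSE 'NONE'
       END AS SuggestedAction
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ps
INNER JOIN sys.indexes AS i
    ON ps.[object_id] = i.[object_id]
   AND ps.index_id = i.index_id
WHERE ps.index_id > 0           -- skip heaps
  AND ps.page_count > 1000      -- ignore very small indexes
ORDER BY ps.avg_fragmentation_in_percent DESC;
```

You would then run ALTER INDEX ... REORGANIZE or ALTER INDEX ... REBUILD against the indexes the query flags.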
---------------------------------------------------------------------
July 17, 2013 at 9:03 pm
Thank you so much for your suggestion. Is there a way to figure out which stats are most used and need updating?
Another question... For two 150 GB databases, both of their ldf files are in the same folder. Is it recommended to do maintenance one by one, or could we do it in parallel? I am kind of worried about doing it in parallel, as that could fill up the log and make more use of tempdb.
Another question... Is setting Auto Update Statistics to true a good practice? That way, we wouldn't need to do maintenance on statistics?
July 18, 2013 at 6:56 am
SQL_Surfer (7/17/2013)
Thank you so much for your suggestion. Is there a way to figure out which stats are most used and need updating?
Another question... For two 150 GB databases, both of their ldf files are in the same folder. Is it recommended to do maintenance one by one, or could we do it in parallel? I am kind of worried about doing it in parallel, as that could fill up the log and make more use of tempdb.
Another question... Is setting Auto Update Statistics to true a good practice? That way, we wouldn't need to do maintenance on statistics?
The folder that holds the log files should be large enough to accommodate both at their maximum; you don't want to be continually shrinking them. Having said that, reindex the databases consecutively if you can, to reduce load.
It is best practice to leave auto update stats on. There are edge cases where stats can get updated at busy times when the server is under load.
Look for indexes that are heavily used; those are the ones whose stats are most likely to need updating. The classic case is an ever-increasing clustered index: as only one row is added at a time, that's not enough to trigger an auto stats update, but there is (recent and probably important) data in the table that SQL has no metadata on. Use this query:
-- Index Read/Write stats for a single table
SELECT OBJECT_NAME(s.[object_id]) AS [TableName],
       i.name AS [IndexName], i.index_id,
       SUM(user_seeks) AS [User Seeks], SUM(user_scans) AS [User Scans],
       SUM(user_lookups) AS [User Lookups],
       SUM(user_seeks + user_scans + user_lookups) AS [Total Reads],
       SUM(user_updates) AS [Total Writes]
FROM sys.dm_db_index_usage_stats AS s
INNER JOIN sys.indexes AS i
    ON s.[object_id] = i.[object_id]
   AND i.index_id = s.index_id
WHERE OBJECTPROPERTY(s.[object_id], 'IsUserTable') = 1
  AND s.database_id = DB_ID()
  AND OBJECT_NAME(s.[object_id]) = N'tablename'
GROUP BY OBJECT_NAME(s.[object_id]), i.name, i.index_id
ORDER BY [Total Writes] DESC, [Total Reads] DESC OPTION (RECOMPILE);
Use either DBCC SHOW_STATISTICS to get the date the stats were last updated, or the STATS_DATE function, something like this:
SELECT i.name AS [index name],
       STATS_DATE(i.[object_id], i.index_id) AS [stats date]
FROM sys.objects AS o
INNER JOIN sys.indexes AS i
    ON o.[object_id] = i.[object_id]
WHERE o.name = 'tablename';
---------------------------------------------------------------------
July 19, 2013 at 1:31 pm
Thank you, and thanks for the query to find heavily used indexes. As far as statistics are concerned, though, this captures only the index-based statistics, not the auto-created _WA statistics. So should we update all the statistics on the table? That would be very time consuming. Any idea how to pull the heavily used _WA statistics?
July 19, 2013 at 2:29 pm
SQL_Surfer (7/19/2013)
Thank you, and thanks for the query to find heavily used indexes. As far as statistics are concerned, though, this captures only the index-based statistics, not the auto-created _WA statistics. So should we update all the statistics on the table? That would be very time consuming. Any idea how to pull the heavily used _WA statistics?
I don't know of a query to do that.
Have you tried updating all stats? Updating the stats would only be time consuming if you run them all with FULLSCAN. Run sp_updatestats periodically if you have doubts about the age of your stats. Index rebuilds will update the stats, and you can update the stats for those indexes you only reorganised.
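A quick sketch of those two options (the table name and sample percentage here are placeholders, not recommendations):

```sql
-- Update every statistic in the current database that SQL Server's
-- modification counters flag as out of date (sampled scan by default).
EXEC sp_updatestats;

-- Or target a single table: a sampled update is cheap,
-- FULLSCAN is the accurate but slow option.
UPDATE STATISTICS dbo.tablename WITH SAMPLE 25 PERCENT;
UPDATE STATISTICS dbo.tablename WITH FULLSCAN;
```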
---------------------------------------------------------------------
July 20, 2013 at 11:20 am
I ran the following statement:
SELECT ps.database_id, ps.[object_id],
       ps.index_id, b.name,
       ps.avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, NULL) AS ps
INNER JOIN sys.indexes AS b
    ON ps.[object_id] = b.[object_id]
   AND ps.index_id = b.index_id
WHERE ps.database_id = DB_ID()
ORDER BY ps.[object_id];
I see some rows with index_id = 0 and a NULL name but with high fragmentation. I looked up the object with index_id = 0, and it turned out to be a table. What does a table with high fragmentation mean? One of them was about 98%. How can we reduce the fragmentation on the table?
July 20, 2013 at 11:50 am
An index_id of 0 and a name of NULL means it's a heap: a table without a clustered index. You can't rebuild a heap.
And in case anyone posts the 'create a clustered index and then drop it' bad advice...
http://www.sqlskills.com/blogs/paul/a-sql-server-dba-myth-a-day-2930-fixing-heap-fragmentation/
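If a heap genuinely needs defragmenting, the accepted fix is to give it a permanent clustered index rather than creating and dropping one. A minimal sketch, where the table and column names are placeholders:

```sql
-- Proper fix on SQL Server 2005: create a permanent clustered index
-- on the heap; a narrow, ever-increasing key is the usual guideline.
CREATE CLUSTERED INDEX CIX_MyHeap_Id
    ON dbo.MyHeap (Id);

-- On SQL Server 2008 and later you can instead rebuild the heap
-- in place, but this is not available on 2005:
-- ALTER TABLE dbo.MyHeap REBUILD;
```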
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
July 21, 2013 at 9:02 am
Any idea why the DB grows 10 GB more after index maintenance? Should I opt for a shrinkfile?
July 21, 2013 at 9:42 am
SQL_Surfer (7/21/2013)
Any idea why DB grows to 10GB more after index maintenance?
Because SQL had to rebuild the indexes somewhere, if there wasn't space for the new indexes, the file would have grown to make space.
Should I opt for shrinkfile?
If you want to completely undo everything that the index rebuild did, leave your indexes more fragmented than before you started, sure.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
July 21, 2013 at 9:50 am
Thanks Gail. Should I expect 10 GB of growth every time I rebuild the indexes? This happened on only one DB, which had a snapshot, but I dropped the snapshot before rebuilding the indexes.
July 21, 2013 at 9:55 am
SQL_Surfer (7/21/2013)
Thanks Gail. Should I expect 10GB growth everytime I rebuild the index?
No, you should expect the file to grow if there's not enough free space for new indexes to be created. If there is enough free space, then there's enough free space and there's no need to grow the file.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
July 21, 2013 at 11:41 am
Just to add to what has already been said...
You don't actually need to take a full backup to get things back to a normal point-in-time backup status after going to the BULK_LOGGED or even the SIMPLE recovery model. You can simply change back to the FULL recovery model if you were in BULK_LOGGED, and you only need to do a DIFF backup if you were in SIMPLE. Of course, I do a FULL backup on my relatively small (only 200GB) databases every night anyway. Just remember that even in BULK_LOGGED, a point-in-time restore can't be done for any log backup taken while you were in BULK_LOGGED; you can only restore the whole log backup or none of it for those time frames.
You can save a whole lot of "growth" during index rebuilds if you partition (table partitioning or a partitioned view) the large tables and related indexes, because they'll be treated as much smaller individual units. It takes a bit of work to set them up and to write the code for automatic maintenance on them, but it's well worth it. If you do some "tricks" with different filegroups for the partitions, it also allows for "piecemeal restores", where you can get the core of a database backup up and running very quickly (not initially loading large log/audit tables, for example) and then load the larger, less important data over time after the initial restore.
Partitioning can also be a real time saver for both index maintenance and backups. For example, if you have large audit tables, you don't have to rebuild the indexes on the temporally stagnant partitions (I divide them up by month), and you only have to back them up once or twice if each partition is in a separate filegroup.
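A minimal sketch of the recovery-model switch, with a log backup on either side of the minimally logged window (the database name and backup paths are placeholders):

```sql
-- Placeholders throughout: MyDb and the backup paths are illustrative only.
BACKUP LOG MyDb TO DISK = N'X:\Backups\MyDb_log_pre.trn';

ALTER DATABASE MyDb SET RECOVERY BULK_LOGGED;

-- ... run the index rebuilds here (minimally logged) ...

ALTER DATABASE MyDb SET RECOVERY FULL;

-- Back up the log again so the minimally logged window is bounded.
BACKUP LOG MyDb TO DISK = N'X:\Backups\MyDb_log_post.trn';
```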
--Jeff Moden
Change is inevitable... Change for the better is not.
July 21, 2013 at 3:37 pm
Jeff Moden (7/21/2013)
You can simply change back to the FULL recovery mode if in Bulk Logged ....
You can, but a log backup right before or after the switch is strongly recommended (doesn't really matter the order)
http://www.sqlservercentral.com/articles/Recovery+Model/89664/
Just remember that even in BULK_LOGGED, a point-in-time restore can't be done for any log backup taken while you were in BULK_LOGGED; you can only restore the whole log backup or none of it for those time frames.
A point-in-time restore can be done while in bulk-logged recovery, unless that log backup contains a minimally logged operation. Only log backups that contain minimally logged operations must be restored in full. If a log backup contains no minimally logged operations, then you can restore to any point within it, just as in full recovery.
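A sketch of what that looks like in practice (the database name, file paths, and STOPAT timestamp are all placeholders):

```sql
-- Placeholders throughout: database name, paths, and the STOPAT time.
RESTORE DATABASE MyDb FROM DISK = N'X:\Backups\MyDb_full.bak'
    WITH NORECOVERY;

-- STOPAT works even if the database was in BULK_LOGGED, provided this
-- particular log backup contains no minimally logged operations.
RESTORE LOG MyDb FROM DISK = N'X:\Backups\MyDb_log.trn'
    WITH STOPAT = N'2013-07-21T09:00:00', RECOVERY;
```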
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability