Rebuilding Stats: Twice or Not At All

  • Andy Warren

    SSC Guru

    Points: 119676

    Comments posted to this topic are about the item Rebuilding Stats: Twice or Not At All

  • Perry Whittle

    SSC Guru

    Points: 233784

    Hi Andy

    this was exactly the same line of thinking i had. Very informative. I think you're right no-one ever changes the default for stats during the create index

    -----------------------------------------------------------------------------------------------------------

    [font="Tahoma"]"Ya can't make an omelette without breaking just a few eggs"[/font] 😉

  • John.Sansom

    SSCertifiable

    Points: 7335

    Thanks Andy!


    John Sansom (@sqlBrit) | www.johnsansom.com

  • Stan Vaughan

    SSC Journeyman

    Points: 91

    Is it not true that the Automatic Update of statistics is only invoked if a certain percentage of of rows are updated in the underlying table. So if you have a very large table (in terms of rows) that an update of stats may not occur everytime an insert is done even if this insert constitutes thousands of rows (i.e. in a table that has millions) ?

  • Andy Warren

    SSC Guru

    Points: 119676

    Thats correct. MS has a whitepaper on it, stats get updated when you go from zero to not zero rows, when it crosses 500 changes, and when it crosses 500 changes plus 20%.

    20% of even 20 million rows is a lot, but it depends on how your data is being modified as to how important it is to update stats more often. The biggest time when it matters is when the data crosses the threshold from where it would do a bookmark lookup to not.

  • alen teplitsky

    SSC-Dedicated

    Points: 30014

    back in the sql 2000 days we used to run dbcc reindex and indexdefrag depending on the frag level and then updating statistics. and we had some strange problems sometimes where a query would not use an index even though it had to.

    one time we ran a query on the publisher and a subscriber. query hit 1 table and the indexes were the same on both ends. on the publisher we would get an index scan and the subscriber an index seek. had a case with PSS and asked them if they could give us a quick answer. he said check the statistics. we did and they were different. so we stopped updating statistics and alter index only does rebuilds. before that we updated stats with full scan once a week and still didn't solve it. only rebuilding indexes and not updating the statistics seems to work. probably a bug where it rebuilds only statistics

  • Steve Howard-434180

    SSC Enthusiast

    Points: 129

    just to quibble with words a little bit: sp_updatestats only checks to see if rowmodctr is 0 or not. If it is not 0, it updates stats. That's a little different from only updating if the statistics need it. It is also worth considering that the rowmodctr that it checks was the actual indicator in SQL 2000, but is only an approximation in SQL 2005. The colmodctr that is actually used for autoupdate of statistics in 2005 is not visible to us.

    Despite what a whitepaper may say, I've seen in dbcc show_statistics times when on very large indexes (Very large, as in > 1 billion rows) that the sampling used was less than 1 percent. That's probably a sufficient sampling most times, but can lead to skewed statistics on the very rare occasion. The times where I have seen skewed statistics, there was no mistaking what it did to performance. In cases like these, if statistics are rebuilt occasionally using a higher sampling, then the threshold of modifications that leads to an autoupdate of statistics is never reached, so the statistics don't get updated with the smaller smapling that the autoupdate stats uses.

    One of the biggest reasons I can think of to update statistics periodically (nightly if you have slow time during the night) is that the modctr is set back to 0 when you update, and this makes it unlikely that the number of modifications during peak business will cross the threshold that trips autoupdate of stats on a table. The autoupdate should be used like a safety valve in case that threshold is reached, but minimizing autoupdate during your busiest time (by proactively updating during slow time) is a good practice.

  • magarity kerns

    SSCrazy

    Points: 2765

    What that means is that each time you rebuild (not reorg/defrag) the statistics related to the index will be built at the same time

    I couldn't really figure out this sentence; are you saying the index is built when the statistics are rebuilt? It seems more likely that the statistics are refreshed if the index is rebuilt. Pardon if this is a dumb question.

  • alen teplitsky

    SSC-Dedicated

    Points: 30014

    when you rebuild an index the statistics are also rebuilt for that index. if you rebuild an index and then run update statistics without a full scan it's the equivelant of rebuilding stats with a full scan and then again without a full scan. at least for some of the stats on that table

  • Anipaul

    SSC-Insane

    Points: 24681

    Nice one....

  • YeshuaAgapao

    Hall of Fame

    Points: 3495

    DBCC DBREINDEX and ALTER INDEX REBUILD rebuilds statistics with the RESAMPLE option - it uses the sample percentage that was used on the last manual or auto update. It also resets sysindexes.rowmodctr

    I used this code to test. Change the @TableName and @IndexName assignments to a table/index with statistics updated to a non-default sample percent but has had time to have change activity on it.

    DECLARE @SQL nVarChar(4000), @TableName sysname, @IndexName sysname

    SELECT @TableName='table', @IndexName='Index in table'

    --Test reorganize statistics behavior

    SELECT @SQL='

    DBCC SHOW_STATISTICS(@TableName, @IndexName) WITH STAT_HEADER

    SELECT id, name, rowmodctr FROM sysindexes WHERE id=OBJECT_ID(@TableName) AND name=@IndexName

    ALTER INDEX ' + @IndexName + ' ON ' + @TableName + ' REORGANIZE

    DBCC SHOW_STATISTICS(@TableName, @IndexName) WITH STAT_HEADER

    SELECT id, name, rowmodctr FROM sysindexes WHERE id=OBJECT_ID(@TableName) AND name=@IndexName

    '

    EXECUTE sp_executeSQL @SQL, N'@TableName sysname, @IndexName sysname', @TableName=@TableName, @IndexName=@IndexName

    --Test rebuild statistics behavior

    SELECT @SQL='

    DBCC SHOW_STATISTICS(@TableName, @IndexName) WITH STAT_HEADER

    SELECT id, name, rowmodctr FROM sysindexes WHERE id=OBJECT_ID(@TableName) AND name=@IndexName

    ALTER INDEX ' + @IndexName + ' ON ' + @TableName + ' REBUILD WITH (ONLINE=ON)

    DBCC SHOW_STATISTICS(@TableName, @IndexName) WITH STAT_HEADER

    SELECT id, name, rowmodctr FROM sysindexes WHERE id=OBJECT_ID(@TableName) AND name=@IndexName

    '

    EXECUTE sp_executeSQL @SQL, N'@TableName sysname, @IndexName sysname', @TableName=@TableName, @IndexName=@IndexName

    Is there a undocumented way to get to colmodctr through the system tables in the hidden resource database?

  • Isabelle2378

    SSCertifiable

    Points: 6158

    Thanks for pointing that out. I was doing that exact thing! I did not realize that our indexes have that option set and my job was updating statistics after rebuilding the indexes. I have removed that step now. I don't need the server doing all that extra work if it doesn't have to! 😀

    Great article.

    Isabelle

    Thanks!
    Bea Isabelle

  • Steve Howard-434180

    SSC Enthusiast

    Points: 129

    YeshuaAgapao, I have to disagree with you on rebuild using a resample to update statistics. It definitely uses a fullsample. Here is an example I just did on a test table I created in tempdb to test this:

    dbcc show_statistics('forDemo3', 'PK__forDemo3__31EC6D26')

    update statistics forDemo3 with SAMPLE 10 PERCENT

    ALTER INDEX ALL on forDemo3 REBUILD

    dbcc show_statistics('forDemo3', 'PK__forDemo3__31EC6D26')

    the two results were:

    Name Updated Rows Rows Sampled Steps Density Average key length String Index

    PK__forDemo3__31EC6D26 Jul 29 2008 12:28PM 2000000 326553 181 1 4 NO

    Name Updated Rows Rows Sampled Steps Density Average key length String Index

    PK__forDemo3__31EC6D26 Jul 29 2008 12:30PM 2000000 2000000 2 1 4 NO

    In the second results of dbcc show_statistics, you can definitely see that fullscan was used.

    You are correct that rowmodctr is reset when the indexes are rebuilt - statistics are updated at that time, so it must be reset.

  • YeshuaAgapao

    Hall of Fame

    Points: 3495

    Oh cool. I just tested that myself. I wouldn't think Microsoft would be inconsistent like that. A 25% sampling became 100% after rebuilding an index. I only tested on a full-scanned index because people (including me) here were paranoid about index rebuilding using normal default automatic statistics update sample percentages.

  • sbateman-803743

    SSC Veteran

    Points: 222

    thnx for the information

    Has any one got a script that will show status of all database.indices Statistics settings including last update.date, so that I can identify incorrect settings and days since last update.

Viewing 15 posts - 1 through 15 (of 18 total)

You must be logged in to reply to this topic. Login to reply