March 18, 2009 at 3:00 am
GilaMonster (3/18/2009)
Tim Walker (3/17/2009)
GilaMonster (3/17/2009)
The usual rule of thumb is 1000 pages in a table/index before you need to worry about fragmentation.
This is a common view, but I don't agree; I think 64-ish pages is a sensible cutoff.
The 1000 page rule-of-thumb comes from Paul Randal. Apparently (he talked about this during one of his recent presentations) people kept on asking him what size tables should be defragmented. He guessed, based on what he knew of the storage engine, then went back and tested later. It turned out in testing that around 1000 pages was the point where the effects of fragmentation on IO speed became noticeable.
Thanks for that great feedback, you can't argue with that from an I/O speed perspective.
My rationale for my figures on this thread is slightly different. I reckon there are lots of SQL Server installations where a number of databases are installed on the same fairly low-spec SQL Server. Hopefully there is 4GB of RAM, and hopefully the OS switch is set to make 3GB of it available to SQL. Let's assume it's all data cache for the sake of argument.
I think there is a good chance that contention for that 3GB will reduce the lifetime pages spend in cache. So if you work to a 30% limit as suggested, you can only get 2GB of effective data into the cache, together with 1GB of effectively unused space (when averaged across all the pages).
Then you get thrashing and more regular I/O requirements (although the I/O response time for each requirement may not be significantly degraded).
So mine is more a cache efficiency point than a pure I/O point.
.
March 18, 2009 at 4:26 am
Tim Walker (3/18/2009)
I think there is a good chance that contention for that 3GB will reduce the lifetime pages spend in cache. So if you work to a 30% limit as suggested, you can only get 2GB of effective data into the cache, together with 1GB of effectively unused space (when averaged across all the pages).
Could you explain why you say that?
When I talk about 30%, I'm talking about average fragmentation being 30%, not average space free per page.
So mine is more a cache efficiency point than a pure I/O point.
Fragmentation has nothing to do with cache. If the storage engine reads a table into cache it reads in just the pages requested, regardless of where those pages are on disk. It won't accidentally read in other pages that are mixed up with the ones it wants.
I can see your logic if you were talking about the average % free per page, but not average logical fragmentation in %.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
March 18, 2009 at 4:48 am
GilaMonster (3/18/2009)
Tim Walker (3/18/2009)
I think there is a good chance that contention for that 3GB will reduce the lifetime pages spend in cache. So if you work to a 30% limit as suggested, you can only get 2GB of effective data into the cache, together with 1GB of effectively unused space (when averaged across all the pages).
Could you explain why you say that?
When I talk about 30%, I'm talking about average fragmentation being 30%, not average space free per page.
So mine is more a cache efficiency point than a pure I/O point.
Fragmentation has nothing to do with cache. If the storage engine reads a table into cache it reads in just the pages requested, regardless of where those pages are on disk. It won't accidentally read in other pages that are mixed up with the ones it wants.
I can see your logic if you were talking about the average % free per page, but not average logical fragmentation in %.
You're right of course, my explanation was fundamentally flawed in that respect!
I am talking about free space per page because I think the two occur together most of the time. For example, an insert causes a page split, the new page is created out of order, fragmentation increases, and both the old and new pages are now around 50% full.
I suppose you might say that I should check free space per page instead but I think I'd get pretty much the same result.
I think my point still stands, it was just very badly explained the first time!
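If you want to check, here's the sort of query I'd use (an untested sketch; adjust the page-count filter to taste). It shows logical fragmentation and page fullness side by side so you can see whether they really do move together:

SELECT OBJECT_NAME(ips.object_id) AS table_name,
       ips.index_id,
       ips.page_count,
       ips.avg_fragmentation_in_percent,
       ips.avg_page_space_used_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'SAMPLED') AS ips
WHERE ips.page_count > 64   -- the 64-ish page cutoff I mentioned earlier
ORDER BY ips.avg_fragmentation_in_percent DESC;
-- SAMPLED mode is needed so avg_page_space_used_in_percent is populated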
.
March 18, 2009 at 4:57 am
Tim Walker (3/18/2009)
I suppose you might say that I should check free space per page instead but I think I'd get pretty much the same result.
Not necessarily. It's very possible to get high free space per page with very low fragmentation (after a rebuild with a low fill factor), or high fragmentation with low free space per page (after, say, a database shrink).
The 1000 page recommendation is based on IO impact of logical fragmentation, so if the free space is more of a concern for you, base your rebuild scripts on the free space per page and a different page count.
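If it helps, a free-space-based check might look something like this (just a sketch; the thresholds are examples only, pick values that suit your system):

SELECT OBJECT_NAME(ips.object_id) AS table_name,
       i.name AS index_name,
       ips.page_count,
       ips.avg_page_space_used_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'SAMPLED') AS ips
JOIN sys.indexes AS i
  ON i.object_id = ips.object_id AND i.index_id = ips.index_id
WHERE ips.index_id > 0                          -- indexes only, skip heaps
  AND ips.page_count > 64                       -- example page-count threshold
  AND ips.avg_page_space_used_in_percent < 70;  -- example free-space threshold
-- Then rebuild whatever this returns, e.g. ALTER INDEX [index_name] ON [table_name] REBUILD;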
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
March 18, 2009 at 5:05 am
Double post. Never mind.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
March 18, 2009 at 7:11 am
Nice discussion...
Now I'm "optimizing" my optimization. :-)
That's why I have some more questions 😀
I found that the indexes with more than 1000 pages were rebuilt, and I have some others that have more than 30% fragmentation but fewer than 1000 pages. In the result set of my query I have some heaps that have lots of pages (90,000) and 99% fragmentation. What should I do with them? Are they a problem? Or is this normal?
And a second question... what steps should I follow? Now I'm thinking of updating the statistics... for the columns that aren't in indexes... and I don't know what to do next...
Please help :-)
Thanks for all the responses...I know it will help lots of readers...
March 18, 2009 at 7:50 am
I take Gail's point, but I'd still say rebuild those indexes at greater than 30% fragmentation even if they have less than 1000 pages. It won't make a huge difference either way, though.
Your heaps may be more of a worry. Fragmentation figures for a heap relate to extents and don't mean much. What you need to watch here is avg_page_space_used_in_percent in sys.dm_db_index_physical_stats, because when rows are deleted from heaps the space is not automatically reclaimed.
For this reason I think it's normally best to have a clustered index on a table unless you have a good reason not to.
If you don't want a clustered index permanently, creating and dropping one again will defragment the heap.
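Something along these lines (the table, column and index names are just placeholders):

-- Build a temporary clustered index to compact the heap...
CREATE CLUSTERED INDEX IX_TempDefrag ON dbo.MyHeapTable (SomeColumn);
-- ...then drop it again; the table is a heap once more, but its pages have been rewritten compactly
DROP INDEX IX_TempDefrag ON dbo.MyHeapTable;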
In 2005, use the ALTER INDEX ... REBUILD command to rebuild the whole index. This is an offline operation unless you have Enterprise Edition. It normally generates transaction log a little bigger than the size of the table, unless you have a non-unique clustered index, which can generate a lot more.
You can also use ALTER INDEX ... REORGANIZE, which is an online operation in all editions, but this is less effective and normally creates a very large transaction log.
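The syntax for both is along these lines (the index and table names are placeholders):

ALTER INDEX IX_MyIndex ON dbo.MyTable REBUILD;                      -- offline rebuild
ALTER INDEX IX_MyIndex ON dbo.MyTable REBUILD WITH (ONLINE = ON);   -- online rebuild, Enterprise Edition only
ALTER INDEX IX_MyIndex ON dbo.MyTable REORGANIZE;                   -- online in all editions, less thorough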
Statistics are rebuilt automatically when you rebuild an index. To keep them up to date otherwise, either use the auto update statistics option or run sp_updatestats on a schedule to control exactly when it is done.
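For example (just a sketch; the database name is a placeholder):

-- Leave the automatic option on...
ALTER DATABASE MyDatabase SET AUTO_UPDATE_STATISTICS ON;
-- ...and/or refresh all statistics in the current database on your own schedule
EXEC sp_updatestats;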
.
March 18, 2009 at 7:51 am
GilaMonster (3/18/2009)
Tim Walker (3/18/2009)
I suppose you might say that I should check free space per page instead but I think I'd get pretty much the same result.
Not necessarily. It's very possible to get high free space per page with very low fragmentation (after a rebuild with a low fill factor), or high fragmentation with low free space per page (after, say, a database shrink).
The 1000 page recommendation is based on IO impact of logical fragmentation, so if the free space is more of a concern for you, base your rebuild scripts on the free space per page and a different page count.
Good points, thanks for the feedback, particularly where the 1000 pages figure came from...
.
March 18, 2009 at 8:03 am
First, as you can see in my reply with the code, I rebuild all the indexes. For some, as I said before, I don't know why, but they still have > 10% avg_fragmentation, even 90%, but the number of pages is < 1000, so I think that's OK.
In the avg_page_space_used_in_percent column of sys.dm_db_index_physical_stats I have only NULL values... so I can't use this information.
And for the tables with heaps... I have indexes on them, clustered of course, and for some nonclustered also.
Should I drop and create them again to see if I get rid of the heap?
Thanks for the reply
March 18, 2009 at 8:14 am
shnex (3/18/2009)
I have some heaps that have lots of pages(90 000) and 99% fragmentation. What should I do with them?
There's nothing you can do with them. Heaps cannot be rebuilt. The fragmentation value given for them isn't the same as for an index: the avg % fragmentation for heaps is extent fragmentation (since pages in a heap have no order, there's no meaning to the concept of 'logical fragmentation').
As with logical fragmentation, extent fragmentation is only a problem for large scans of the table.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
March 18, 2009 at 8:28 am
Have a look at
SELECT * FROM sys.dm_db_index_physical_stats (DB_ID(),NULL,NULL,NULL,'SAMPLED')
The 'SAMPLED' parameter will populate avg_page_space_used_in_percent (and the other columns) by sampling the underlying pages. You can use DETAILED instead of SAMPLED to read all of the pages, but it takes longer.
Once you have this info you'll be able to see how you can change your code to check the fragmentation before deciding which tables / indexes to rebuild.
A table can't both be a heap and have a clustered index. If there is a clustered index, the rebuild will arrange the data in the same order as the index. If it is a heap, nothing will happen because the data has no order. You'll be able to see this in the query above as well.
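If it helps, you can join that DMV to sys.indexes to see the index name and type next to each row; something like this (just a sketch), where index_id 0 means the table is a heap:

SELECT OBJECT_NAME(ips.object_id) AS table_name,
       i.name AS index_name,               -- NULL for a heap
       ips.index_id,                       -- 0 = heap, 1 = clustered index, >1 = nonclustered
       ips.index_type_desc,
       ips.page_count,
       ips.avg_fragmentation_in_percent,
       ips.avg_page_space_used_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'SAMPLED') AS ips
JOIN sys.indexes AS i
  ON i.object_id = ips.object_id AND i.index_id = ips.index_id;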
.
March 18, 2009 at 8:35 am
Do you guys have a good query to use to check for the % of frag, and # of pages?
I'm assuming you're checking the index physical stats DMV, correct?
March 18, 2009 at 8:58 am
Man... I agree with you, but when I run the query against sys.dm_db_index_physical_stats I receive an object_id, and object_name(object_id) returns a table which has, for instance, 2 indexes, 1 clustered and 1 nonclustered. Maybe I'm not understanding the results correctly, but I believe what I'm saying is right.
So Gila, is it a problem if I leave the heaps this way, or should I try somehow to remove them and have an index? If I drop and recreate the existing index on that table, should the heap disappear?
March 18, 2009 at 9:09 am
shnex (3/18/2009)
If I drop and recreate the existing index on that table, should the heap disappear?
I'm not sure we're exactly clear on the definition of a heap.
A heap is a table that does not have a clustered index. If you have a heap and add a clustered index to it, it's no longer a heap. If you remove the clustered index, it reverts to being a heap.
I receive an object_id, and object_name(object_id) returns a table which has, for instance, 2 indexes, 1 clustered and 1 nonclustered
That's not a heap.
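If you want to see which tables in the database really are heaps, a quick way (just a sketch) is to look for index_id 0 in sys.indexes:

SELECT OBJECT_NAME(object_id) AS table_name
FROM sys.indexes
WHERE index_id = 0                                     -- 0 = heap, 1 = clustered index
  AND OBJECTPROPERTY(object_id, 'IsUserTable') = 1;    -- user tables only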
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
March 18, 2009 at 9:27 am
You should create clustered indexes on all your tables. SQL Server is written to somewhat expect them. If it's a small table, pick a PK field. If it's larger, think about it a bit; choose a field that groups the data according to how you query it most often.
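For example (the table and column names here are placeholders, not a recommendation for any particular schema):

-- Small table: cluster on the primary key column
CREATE CLUSTERED INDEX CIX_Customers_CustomerID ON dbo.Customers (CustomerID);
-- Larger table: cluster on a column that groups rows the way you usually query them
CREATE CLUSTERED INDEX CIX_Orders_OrderDate ON dbo.Orders (OrderDate);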