Question regarding Index Fragmentation

  • Hi all,

    I'm using the above system function to get back some index utilization statistics, to help me determine whether or not an index requires rebuilding/reorganizing.

    I'm encountering two situations which I'm not sure how to explain.

    One is that, despite rebuilding an index, it still remains "fragmented", and will not change. Now, I think I've read about the explanation for this, and it is due to the fact that the table contains very few records, and thus that fragmentation doesn't really mean anything. But I have some tables with several thousand rows, perhaps even into the tens of thousands, for which the indexes still remain fragmented after rebuilding. One has a fragmentation of 25% with 14000 rows, while another has a 50% fragmentation with 5000 rows. Is this normal, and is it a problem that needs to be addressed?

    The other situation is regarding an index which has several index levels. From what I know, indexes split once they reach a certain threshold of nodes, and once that number of nodes exceeds a certain level, they move down to another row. This makes sense to me - but, what doens't make sense is that, despite rebuilding the indexes, the fragmentation still exists. It seems like the bottom level has the highest fragmentation, with the level of fragmentation going down as the level goes up; the topmost level is not fragmented at all. Is this normal? See below for more info.

    ObjectNameIndexNamePercentFragmentindex_depthindex_levelavg_fragmentation_in_percentfragment_countavg_fragment_size_in_pagespage_countavg_page_space_used_in_percentrecord_count

    ct_TrackingUserClickIND_DateClicked_Clicks_NewsletterID_AdID66.666666674266.6666666731354.1388683411

    ct_TrackingUserClickIND_DateClicked_Clicks_NewsletterID_AdID21.167883214121.16788321508.2241189.7958117193375

    ct_TrackingUserClickIND_DateClicked_Clicks_NewsletterID_AdID0.716473537400.716473537837111.5579459337499.5884111726887327

  • It's a little hard to see the data breakdown on the columns you have posted, but if I'm reading it right, the most fragmented index only has 3 pages. This is below the threshold where defragmentation should be worried about at all. Note, it's the number of pages that compose the index, not the number of rows. The index with 21% fragmentation has 411 pages. This one is in the low range for index size where fragmentation will affect it. The official Microsoft number is 1000 pages or more for fragmentation to be a factor, but that number is very arbitrary. You should see improvement in the 411 page index, but I'm not surprised that the other isn't improving at all.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Just to confirm, the number of pages below which rebuild/reorg won't do anything is 1,000. Below that size, the whole index is likely to be loaded into RAM, and that means fragmentation won't matter.

    The other thing to do is to look at sys.dm_db_index_usage_stats and see whether the indexes are getting scans. If they don't get very many scans, then fragmentation won't matter. Of course, that data is only since the last time SQL Server was started up, so if that was recently, don't count on that data.

    I've found this query really useful in looking at index data:

    select *

    from sys.dm_db_index_usage_stats as UseStats

    inner join sys.indexes IDX

    on UseStats.object_id = IDX.object_id

    and UseStats.database_id = db_id(db_name())

    and UseStats.index_id = IDX.index_id;

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • GSquared (11/16/2009)


    Just to confirm, the number of pages below which rebuild/reorg won't do anything is 1,000. Below that size, the whole index is likely to be loaded into RAM, and that means fragmentation won't matter.

    The other thing to do is to look at sys.dm_db_index_usage_stats and see whether the indexes are getting scans. If they don't get very many scans, then fragmentation won't matter. Of course, that data is only since the last time SQL Server was started up, so if that was recently, don't count on that data.

    I've found this query really useful in looking at index data:

    select *

    from sys.dm_db_index_usage_stats as UseStats

    inner join sys.indexes IDX

    on UseStats.object_id = IDX.object_id

    and UseStats.database_id = db_id(db_name())

    and UseStats.index_id = IDX.index_id;

    Direct from the horses mouth, the 1000 page number is arbitrary, not based on emperical evidence. However, fewer pages, especially getting down below 500, fragmentation means less & less, and greater number of pages fragmentation is more and more important. That basic rule still applies. Just that number, 1000, isn't exactly meaningful. Other than that, complete, 100% agreement.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • When looking at index fragmentation, it's usually only necessary to look at the leaf level (level 0). The others are very often small and, especially for a frequently used index) in memory most of the time. Since fragmentation is an issue only when reading from the disk, it's not a concern for things frequently cached.

    As for index structure and levels, this may help. http://www.sqlservercentral.com/articles/Indexing/68439/

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Grant Fritchey (11/16/2009)


    Direct from the horses mouth, the 1000 page number is arbitrary, not based on emperical evidence.

    According to Paul, it was an estimate based on knowledge of the storage engine and, upon later testing, turned out to be more or less a reasonable threshold. (same as the 30% rebuild threshold). It's certainly not an absolute, hard and fast number (999 pages don't bother, 1001 pages panic).

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • I take it then, that despite my original intuition, the leaf level is actually index level 0, not the highest index level? In that case, your explanation makes sense, as the leaf level fragmentation is under 1%. Thanks for the explanation!

  • GilaMonster (11/16/2009)


    Grant Fritchey (11/16/2009)


    Direct from the horses mouth, the 1000 page number is arbitrary, not based on emperical evidence.

    According to Paul, it was an estimate based on knowledge of the storage engine and, upon later testing, turned out to be more or less a reasonable threshold. (same as the 30% rebuild threshold)

    But it's not a hard number, just a reasonable starting point.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • kramaswamy (11/16/2009)


    I take it then, that despite my original intuition, the leaf level is actually index level 0, not the highest index level? In that case, your explanation makes sense, as the leaf level fragmentation is under 1%. Thanks for the explanation!

    Yes the leaf level is 0 and the root level is the highest number. In your case 2.

  • kramaswamy (11/16/2009)


    I take it then, that despite my original intuition, the leaf level is actually index level 0, not the highest index level?

    Indeed. The leaf is level 0, the highest level will be the root (with just 1 page in it)

    If you query sys.dm_db_index_physical_stats with the DETAILED option, you'll see all levels, run it with LIMITED and you'll only see the leaf levels.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Alrighty, in that case, how about this one?

    index_type_descindex_levelavg_fragmentation_in_percentfragment_countavg_fragment_size_in_pagespage_count

    CLUSTERED INDEX099.250381832350531235053

    This thing was done using the LIMITED option, and it does seem to indicate the level 0 leaf node fragmentation, saying it is 99% fragmented. However, when I right-click on the index and go to the properties screen, then look at the Fragmentation tab, it says that the Total fragmentation is 0.00%. Which one is correct?

  • I'd say 99%. The fragment size is 1, indicating there are no two contiguous pages, and the number of fragments is the same as the number of pages. This index looks like it was rebuilt and then 'reversed' by a file or DB shrink operation.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Hm. Well, I just did a rebuild on it, and the fragmentation remains at 99% from the query. Any other ideas?

  • Do you have auto_shrink on?

    Can you run the query with the detailed option and post it so that it's easily readable. If you look at the earlier posts, it's hard to match column names to values.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • I don't think auto shrink is on, how can I find out though?

    As for the information, I'm not sure how to post it here in a readable fashion, so I'm going to attach it.

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply