Huge Index growth

  • Hi All

    In the last month, I have noticed massive growth on one of the tables in my database. The total indexes went from 10GB to 70GB. This is unusual for this system as the indexes were always at a steady +-10GB in size

    I have confirmed with the App team that it is to be expected for this time and I can expect even more growth

    The current fill factor is 0, will changing the fill factor to something like 50% help with controlling the growth on this table?

    This table is not queried from nearly as much as it is inserted into.

    Getting downtime to rebuild the indexes is not easy so I will have to make sure that when I do get the time to rebuild the indexes, I make the right call.

    I am almost sure that if I rebuild the indexes and specify a fill factor of 50% will double the size of my indexes but at the same time will keep it from growing again, at least for now.

    Is this right?

    Thanks

  • hi there,

    with the appropriate version of SQLServer you can do the rebuild with the ONLINE-option so there would not be any downtime. Otherwise use the REORGANIZE-command instead of the REBUILD.

    REORGANIZE is much slower then REBUILD but is an online option too. Reorganizing also makes your data file grow less than rebuilding a huge index.

    After defragmentation of your indexes, which should be done on a regular basis, the size might be reduced dramatically.

    Of course, even the online index operations do have impact on your system as sooner or later there will be some locks on your table. I think the only way is: try it and monitor it.

    Using another fill factor will of course make your table or index even larger. If it is the double by using 50 % fill factor, I don't know if it is exactly that and my experience using different fill factors is quite small...

    The advantage of using a higher fill factor is that it MIGHT reduce index fragmentation as there might be less page splits. Another thing to think of is how the data is inserted: what is the clustered index and ist the data inserted sequentially on the key column(s)? If yes changing the fill factor might not have any impact at all. Another question: Are there updates on existing records? and so on...

    My recommendation: try the index reorganize or online-rebuild command and check how it impacts your system performance and index size.

  • WolfgangE (12/19/2012)


    hi there,

    with the appropriate version of SQLServer you can do the rebuild with the ONLINE-option so there would not be any downtime. Otherwise use the REORGANIZE-command instead of the REBUILD.

    REORGANIZE is much slower then REBUILD but is an online option too. Reorganizing also makes your data file grow less than rebuilding a huge index.

    After defragmentation of your indexes, which should be done on a regular basis, the size might be reduced dramatically.

    Of course, even the online index operations do have impact on your system as sooner or later there will be some locks on your table. I think the only way is: try it and monitor it.

    Using another fill factor will of course make your table or index even larger. If it is the double by using 50 % fill factor, I don't know if it is exactly that and my experience using different fill factors is quite small...

    The advantage of using a higher fill factor is that it MIGHT reduce index fragmentation as there might be less page splits. Another thing to think of is how the data is inserted: what is the clustered index and ist the data inserted sequentially on the key column(s)? If yes changing the fill factor might not have any impact at all. Another question: Are there updates on existing records? and so on...

    My recommendation: try the index reorganize or online-rebuild command and check how it impacts your system performance and index size.

    Clustere Index is on Primary Key Identity Column.

  • Choose your fill factor with performance in mind, not disk space. If disk space is the absolute priority, choose 0/100.

    As I think you have realised, you must find time to rebuild your indexes otherwise performance will suffer and disk usage will grow. If you have Enterprise Edition, you could consider using online index rebuilds, but take into account the effect that will have on tempdb.

    John

  • Clustere Index is on Primary Key Identity Column.

    In this case the clustered index should hardly have fragmentation. Only if there are many updates on existing records there might be page splits.

    If you can avoid a reorg of the clustered index and thus only have to reorg the nonclustered indexes there should not be massive impact on your table while doing so.

  • @SQLSACT please answer WolfgangE re: whether the table accepts updates. Also, does the table accept deletes? You could be seeing a spreading out of the index on disk due to page splits from updates. That coupled with the case of many new inserts, and deletes of older rows where the page is not completely emptied by the delete. What is the current page density of the index?

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Hi Guys

    As far as I know, the table is used as an event logging table and is used only used for inserts

    Also, I have rebuilt all the indexes (23 in total) and the space used by this table was dramatically decreased, to 20GB.

    sidebar When analyzing indexes using sys.dm_db_index_operational_stats. Why does an update statement show an increase in the leaf_update_count column on the Clustered Index and shows an increase in the leaf_insert_count of the nonclustered indexes?

    Do update statements cause an Insert on the nonclustered indexes?

    Thanks

  • hi SQLSACT,

    23 Indexes seems to very very very much for a simple log table. I personally think it's very much for any table. You should check the sys.dm_db_index_usage_stats() too. When checking don't forget that these statistics are resetted when the sqlserver get's restartet.

    Of course an update statement can affect the nonclustered indexes. If you update the key column of an index the entry in the index has to updated to and sorted in the new, correct position.

    If you have filtered indexes and the new value matches the filter criteria of the index there has to an insert in the index.

    Greetings, Wolf

  • WolfgangE (12/21/2012)


    hi SQLSACT,

    23 Indexes seems to very very very much for a simple log table. I personally think it's very much for any table. You should check the sys.dm_db_index_usage_stats() too. When checking don't forget that these statistics are resetted when the sqlserver get's restartet.

    Of course an update statement can affect the nonclustered indexes. If you update the key column of an index the entry in the index has to updated to and sorted in the new, correct position.

    If you have filtered indexes and the new value matches the filter criteria of the index there has to an insert in the index.

    Greetings, Wolf

    Thanks Wolfgang

    Of course an update statement can affect the nonclustered indexes. If you update the key column of an index the entry in the index has to updated to and sorted in the new, correct position.

    If you have filtered indexes and the new value matches the filter criteria of the index there has to an insert in the index.

    I understand that all Indexes including Nonclustered indexes will be affected by updates

    What I meant is that when I run an update statement, sys.dm_db_index_operational_stats shows that an update happened on the clustered index and that an insert happened on the nonclustered index

    I think I figured out why this happens though - Looking at the transaction log using fn_dblog, I see LOP_MODIFY_ROW operations on the clustered index, LOP_DELETE_ROWS on the nonclustered index followed by LOP_INSERT_ROWS on the nonclustered index. It seems that SQL updates nonclustered by deleting the old value and then inserting the new value. I wonder if this is always the case for nonclustered indexes??

    Regarding the indexes, I know for a fact that most of them have never been queried from but have been inserted into. There are duplicate indexes, there are indexes with the same leading column (not sure if this is a problem), also there are nonclustered indexes where the leading column is the same as the clustering key (not sure if this is a problem).

    Thanks

  • SQLSACT (12/18/2012)


    I have confirmed with the App team that it is to be expected for this time and I can expect even more growth

    if this is expected and confirmed then growth should not be a suspect here , data will grow.... its in nature. another thing if , as you say that there are less reads as compare to insert then having 23 indexes there only putting extra overhead in terms of both space and DML operation (here insert) as other people suggested here that use dm_db_index_usage and wipe out all indexes those are not in use.

    -------Bhuvnesh----------
    I work only to learn Sql Server...though my company pays me for getting their stuff done;-)

  • SQLSACT (12/21/2012)


    I think I figured out why this happens though - Looking at the transaction log using fn_dblog, I see LOP_MODIFY_ROW operations on the clustered index, LOP_DELETE_ROWS on the nonclustered index followed by LOP_INSERT_ROWS on the nonclustered index. It seems that SQL updates nonclustered by deleting the old value and then inserting the new value. I wonder if this is always the case for nonclustered indexes??

    It is always a delete followed by an insert in 2008 although both operations might be done to the same page. Not a true in-place update but still pretty efficient. More info: Do changes to index keys really do in-place updates? by Paul Randal

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • opc.three (12/21/2012)


    SQLSACT (12/21/2012)


    I think I figured out why this happens though - Looking at the transaction log using fn_dblog, I see LOP_MODIFY_ROW operations on the clustered index, LOP_DELETE_ROWS on the nonclustered index followed by LOP_INSERT_ROWS on the nonclustered index. It seems that SQL updates nonclustered by deleting the old value and then inserting the new value. I wonder if this is always the case for nonclustered indexes??

    It is always a delete followed by an insert in 2008 although both operations might be done to the same page. Not a true in-place update but still pretty efficient. More info: Do changes to index keys really do in-place updates? by Paul Randal

    Thanks

    Doesn't this make it challenging when interrogating your indexes for number of inserts

    When I check my indexes, I want to be able to see how many inserts ran against it. Now I have to keep in mind that if there are 10000 inserts on my index, some of those could be from an update that was split into a delete followed by an insert.

    Shouldn't the leaf_update_count column be removed then?

    Thanks

  • leaf_update_count and leaf_insert_count appear to track logical update and insert operations correctly on a clustered index but not how I would expect for a nonclustered index.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply