Is rebuilding indexes very often a good idea?

  • Hi Friends,

    I was working on the production server all last week, running an ETL process. Every day I loaded around 2 to 3 million rows into 4 or 5 tables. At the end of day 1, a front-end application saw a performance hit when searching the data. One of my colleagues advised me to rebuild the indexes on the affected tables. I did, and saw some improvement in performance. The very next day, the DBA team automated the process of rebuilding the indexes once my ETL finishes, so they now rebuild the indexes every day on tables containing 15 million rows.

    My question, just out of curiosity: is it good or bad to rebuild the indexes every day?

    Any suggestions would be really appreciated. I am just trying to learn from you guys.

  • Rebuilding indexes is a costly operation. But if it's not interfering with any of your code, running it once a day is not exactly a big deal. I've had systems with serious downtime at night (despite being a 24-hour shop, there just wasn't as much business at some points of the day) that allowed us to do it daily on some systems. It didn't hurt anything and it probably helped some. But that's only if it's not causing resource conflicts. You also need to take into account the degree to which your indexes are fragmented and the size of the indexes. Below a certain threshold on both measures, it's not worth the effort.

    So, there's nothing inherently wrong with it, but monitor to be sure.
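
    A quick way to monitor both measures is sys.dm_db_index_physical_stats. A minimal sketch (the thresholds here are illustrative, not hard rules):

        -- Check fragmentation and size for every index in the current database.
        -- LIMITED is the cheapest scan mode; the page_count filter skips small
        -- indexes where fragmentation is meaningless.
        SELECT  OBJECT_NAME(ips.object_id)  AS table_name,
                i.name                      AS index_name,
                ips.avg_fragmentation_in_percent,
                ips.page_count
        FROM    sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
        JOIN    sys.indexes AS i
                ON  i.object_id = ips.object_id
                AND i.index_id  = ips.index_id
        WHERE   ips.page_count > 1000                    -- size threshold (illustrative)
        AND     ips.avg_fragmentation_in_percent > 10    -- fragmentation threshold (illustrative)
        ORDER BY ips.avg_fragmentation_in_percent DESC;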

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • None of my production databases are that large, so I rebuild all of the indexes and update all statistics in all of my databases as a nightly process so that it doesn't interfere with daytime processing.

    How large your database is will determine how often you need to manage your indexes and statistics.

    I have a specialized index manager whereby I only rebuild indexes that exceed a threshold degree of fragmentation, 30% in my case. This process is much more efficient on a larger database because not all of the indexes become fragmented.

    It is important to manage indexes and statistics on a routine schedule to maximize efficiency on your database.
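
    For anyone wanting to try the same approach, a minimal sketch of a threshold-driven rebuild (this is not Kurt's actual code; only the 30% cutoff comes from his description):

        -- Build and run ALTER INDEX REBUILD statements only for indexes
        -- above the fragmentation threshold.
        DECLARE @sql nvarchar(max) = N'';

        SELECT @sql += N'ALTER INDEX ' + QUOTENAME(i.name)
                     + N' ON ' + QUOTENAME(SCHEMA_NAME(o.schema_id))
                     + N'.' + QUOTENAME(o.name) + N' REBUILD;' + NCHAR(10)
        FROM   sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
        JOIN   sys.indexes AS i
               ON  i.object_id = ips.object_id
               AND i.index_id  = ips.index_id
        JOIN   sys.objects AS o
               ON  o.object_id = ips.object_id
        WHERE  ips.avg_fragmentation_in_percent > 30  -- Kurt's threshold
        AND    i.index_id > 0;                        -- skip heaps

        EXEC sys.sp_executesql @sql;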

    Kurt

    Kurt W. Zimmerman
    SR DBA
    Lefrak Organization
    New York, NY

    http://www.linkedin.com/in/kurtwzimmerman

  • Thank you Grant and Kurt...

    I will check on the fragmentation. I had no idea about it when I asked this question. Now I have an idea of the basis on which indexes should be rebuilt...

    Thanks a lot guys...

  • The main thing that slows down index-related performance is fragmentation. You should do some research around this issue to understand more about fragmentation and what an index rebuild does.

    The index rebuild should leave you with indexes that have a standard amount of free space on each page, as determined by the Fill Factor. As you start to do inserts, index performance will actually improve for a time as the free-space pages are used, and then start to deteriorate as index fragmentation sets in. Eventually the fragmentation in your index will be worse than it was after you completed your index rebuild, and performance can only get worse.

    You should therefore perform regular index rebuilds to keep your indexes in an optimal state. Not every index in your database will be updated at the same rate, and because index rebuilds are an expensive operation, you should work out a way to rebuild only those indexes that need this maintenance.

    The Maintenance Plan process supplied with SQL Server can automate the selection of which indexes need maintenance, but there are better (and free) tools available. Ola Hallengren's DB maintenance toolkit is one of the best free tools available and gets regular updates to cope with new versions of SQL Server, etc. The commercial products often offer more features than the free tools, and Red-Gate, Idera and Dell all have excellent products at very reasonable prices.

    When you have an automated method of doing index maintenance, you need to decide how often you want to run it. Many organisations do this at the weekend, but some find they need to do this work daily. You will need to work out what is best for you.
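
    For reference, once Ola Hallengren's MaintenanceSolution scripts are installed, a typical scheduled call looks something like this (the thresholds and options here are illustrative, not a recommendation):

        -- Reorganize at moderate fragmentation, rebuild at high fragmentation,
        -- and refresh statistics in the same pass.
        EXECUTE dbo.IndexOptimize
            @Databases           = 'USER_DATABASES',
            @FragmentationLow    = NULL,
            @FragmentationMedium = 'INDEX_REORGANIZE,INDEX_REBUILD_ONLINE,INDEX_REBUILD_OFFLINE',
            @FragmentationHigh   = 'INDEX_REBUILD_ONLINE,INDEX_REBUILD_OFFLINE',
            @FragmentationLevel1 = 5,
            @FragmentationLevel2 = 30,
            @UpdateStatistics    = 'ALL';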

    Original author: https://github.com/SQL-FineBuild/Common/wiki/ 1-click install and best practice configuration of SQL Server 2019, 2017, 2016, 2014, 2012, 2008 R2, 2008 and 2005.

    When I give food to the poor they call me a saint. When I ask why they are poor they call me a communist - Archbishop Hélder Câmara

  • Thank you, Ed. I am rebuilding and reorganizing all my indexes to reduce fragmentation and reclaim page space.

    Thanks a lot for guiding me on this....

  • I was recently at a SQL Server user group, and a former Microsoft employee who worked on developing SQL Server gave a two-hour talk about indexes. He basically said that if the table design and index design are good, meaning a good fill factor, you should rarely if ever have to rebuild indexes. He said that if you keep stats updated, that should be sufficient unless you do a TON of inserts and a TON of deletes. He talked about some large companies that have massive tables that have to be available 24x7x365, and he said they can't rebuild indexes... but the key is a good design from the get-go.

    I have monitored the fragmentation of some key tables here, have scaled way back on rebuilding indexes, have Update Stats running weekly, and have not seen any performance hit at all in query duration.
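
    For anyone following the same route, the weekly stats pass can be as simple as this (a minimal sketch; the table name in the second statement is illustrative):

        -- Update statistics database-wide; only stats with row changes are touched.
        EXEC sys.sp_updatestats;

        -- Or target a key table with a full scan for maximum accuracy.
        UPDATE STATISTICS dbo.KeyTable WITH FULLSCAN;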

  • Good or bad to rebuild all indexes daily...

    I think that it is overkill to unconditionally rebuild.

    I posted this elsewhere, but my recommendation is to get hold of Michelle Ufford's index defrag script at http://sqlfool.com/2011/06/index-defrag-script-v4-1/ and implement a job that runs it as a defrag for indexes over 25% fragmented and as a rebuild for indexes over 35% (or other thresholds that make sense for your situation). Michelle's scripts generate a stored procedure that has been bulletproof and highly reliable for us.

    If you have Enterprise edition, you can run this ONLINE overnight, and you can also organize how these defrags are run and throttle the MAXDOP to limit the processors involved in the work. I'd also update stats weekly. Other posters have correctly given similar advice.
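
    A minimal sketch of what that looks like per index on Enterprise edition (index and table names are illustrative; the thresholds come from the recommendation above):

        -- Over 35% fragmented: full rebuild, kept online with capped parallelism.
        ALTER INDEX IX_Orders_CustomerID ON dbo.Orders
        REBUILD WITH (ONLINE = ON, MAXDOP = 4);

        -- Between 25% and 35%: a reorganize is cheaper and is always online.
        ALTER INDEX IX_Orders_CustomerID ON dbo.Orders
        REORGANIZE;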

    Thanks

    John.

  • I'd like to add a special case to the general philosophy here. I worked on a BLOB system that made up the majority of the storage in the DB, and we had a 'noise' area of maybe 0.05% at the top of a particular table where most of our searching and work was done.

    Because of the combination of these factors, that table was rebuilt every night. This was on SQL 2K, where partitioning was flaky at best, so we had to rebuild the entire billion-row table for just a few trailing records (in ratio, anyway) to keep the system up to speed. We had a pretty consistent hit rate from our applications in the background, so we could tell when we had a query slowdown due to I/O and fragmentation concerns.

    So, while I agree that page sizing and standardization of fragmentation are excellent baselines to work from, I would also recommend that you keep in mind the age-old SQL maxim: "It Depends." Sometimes you just have to beat the thing into submission.

    Caveat: I only thought I was an expert at the time (small pond, big fish...); there may have been much better alternatives, but that was a long time ago and I forget the details, and there are much better alternatives available now (partitioning comes to mind).


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions | Forum Netiquette
    For index/tuning help, follow these directions. | Tally Tables

    Twitter: @AnyWayDBA

  • Some food for thought... There is a great video discussing why you generally should NOT worry about fragmentation. Instead, you should set the fill factor to 100 and try to cache as much data as possible, in which case fragmentation levels do not matter as much.

    http://www.brentozar.com/archive/2013/09/why-index-fragmentation-doesnt-matter-video/

    In a nutshell, if you're constantly defragmenting indexes, you are writing more data to the transaction logs, making backups take far longer, and overworking your SAN needlessly. Mirroring and log shipping are also going to take longer and use more resources. I have lessened the frequency of my defragmentation runs and have seen backup times drop quite a bit.
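
    If you follow the fill-factor-100 approach from the video, the rebuild that sets it looks like this (table name is illustrative):

        -- Pack pages completely; page splits on later inserts are accepted
        -- as the cost of fitting more rows per page in cache.
        ALTER INDEX ALL ON dbo.BigTable
        REBUILD WITH (FILLFACTOR = 100);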

  • Andrew Andrews (12/31/2013)


    Some food for thought... There is a great video discussing why you generally should NOT worry about fragmentation. Instead, you should set the fill factor to 100 and try to cache as much data as possible, in which case fragmentation levels do not matter as much.

    http://www.brentozar.com/archive/2013/09/why-index-fragmentation-doesnt-matter-video/

    In a nutshell, if you're constantly defragmenting indexes, you are writing more data to the transaction logs, making backups take far longer, and overworking your SAN needlessly. Mirroring and log shipping are also going to take longer and use more resources. I have lessened the frequency of my defragmentation runs and have seen backup times drop quite a bit.

    I really agree with a lot of what Brent said in that video, but he left out a super critical part. You absolutely should monitor the Average % of page used, for all the reasons he gave for why fragmentation doesn't matter. Most indexes aren't keyed to respect the insert order of rows, and, other than writing good code, indexes play a huge role in fast queries. In order for the indexes to be fast, they also have to be in memory. To be sure, the number of bytes used by indexes is frequently much more than the data itself. Unless you have just an absolute shedload of memory, you don't want to waste it on a bunch of indexes that have pages that are only 50% full.

    You also have to watch larger tables pretty closely and take a different tack with them. You can have a whole lot of pages that have 50% empty space in them before they'll start to show up in the Average % of page used column of sys.dm_db_index_physical_stats. For example, if half the pages in an index are only 50% full and the rest are 100% full, then the index is taking up a third more space than it needs, but the AVERAGE will say the pages are 75% full. On a big index that sits in memory, that's a whole lot of memory being wasted. And that's just one index.
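
    For anyone who wants to check this, a minimal sketch of the query (the page-count cutoff is illustrative):

        -- avg_page_space_used_in_percent is only populated in SAMPLED or
        -- DETAILED mode; LIMITED returns NULL for it.
        SELECT  OBJECT_NAME(ips.object_id)  AS table_name,
                i.name                      AS index_name,
                ips.avg_page_space_used_in_percent,
                ips.page_count
        FROM    sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'SAMPLED') AS ips
        JOIN    sys.indexes AS i
                ON  i.object_id = ips.object_id
                AND i.index_id  = ips.index_id
        WHERE   ips.page_count > 1000        -- ignore small indexes (illustrative)
        ORDER BY ips.avg_page_space_used_in_percent;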

    Brent is absolutely correct. Comparatively speaking, memory is a whole lot cheaper than defragging. But, he's also correct in saying that sometimes buying more memory isn't an option because of budgets or beliefs.

    There's another thing to consider. Restores. If 50% of the space used by your database is empty space and you ever need to do a DR restore to a smaller machine to "get back in business", you might be out of luck because SQL Server is going to restore to the same size as the database that was backed up. If you have, for example, a 200GB database where half of the pages are sitting at 50%, it might not fit in memory and it might not fit on a smaller emergency box. If you do some "crawler" maintenance over time and keep the pages fuller, then the database might actually fit in memory and you might just be able to do a DR restore to a smaller box in a pinch.

    [font="Arial Black"]EDIT.[/font] As a bit of a side bar, I just went through an interesting thing with partitioning. The NCIs (the CI is in "insert order" so it didn't get fragmented much) on the active partition became so fragmented and the Average % of page used got so low that when I rebuilt the indexes, the number of pages used by the partition dropped to 1/3 of what they started out as. These aren't tiny partitions either. Most of the partitions contain 3 or 4 million rows. Don't forget... on these types of tables (it's a W.O.R.M. Audit Table), the part that will be loaded into memory will usually be the active partition.

    So, to answer the question, I'll have to fall back on the answer that is true for all things in SQL Server... "It Depends".

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Why are we not able to rebuild indexes that contain BLOB data? Do we need to drop and recreate them as an alternative?

  • Andrew Andrews (12/31/2013)


    In a nutshell, if you're constantly defragmenting indexes, you are writing more data to transaction logs, making backups take far longer, and overworking your SAN needlessly. Also, mirroring, and log shipping are going to take longer and use more resources. I have lessened the frequency that I do defragmentations and have seen backup times drop quite a bit.

    I presume you mean log backups; anything that compresses the data will speed up full backups and reduce their size.

    ---------------------------------------------------------------------

  • prakashr.r7 (1/2/2014)


    Why are we not able to rebuild indexes having BLOB data? do we need to drop and recreate in alternate?

    You can rebuild indexes that contain BLOB data as INCLUDEs. You just can't rebuild them in an ONLINE fashion prior to SQL Server 2014.

    What error message(s) are you getting?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Hi there - As you're inserting millions of rows into a table, I would strongly recommend dropping the index, inserting the records, then re-applying the index once your process is complete. You will find that this massively reduces the overall processing time, and there will be no requirement to rebuild your index.

    ...that said, each environment is different.
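
    A minimal sketch of that drop/load/recreate pattern for a nonclustered index (all names are illustrative; script out the real index definition before dropping it):

        -- Drop the index so the bulk insert doesn't have to maintain it.
        DROP INDEX IX_Staging_LoadDate ON dbo.Staging;

        -- ... run the ETL bulk insert here ...

        -- Recreate the index once, against the fully loaded table.
        CREATE INDEX IX_Staging_LoadDate ON dbo.Staging (LoadDate);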

    --------------------------------------------

    Laughing in the face of contention...
