Index reorg taking too much time.

  • Hi,

    We have a scheduled task that executes a stored proc that does few database maintenance tasks. One of which is index reorg. I've observed whenever we do stress\load testing, the command is timing out. So I increased the timeout, added some extra logging and got some numbers. The index reorg is taking lot of time, anywhere from 1 hours to 8-9 hours.

    My questions are:

    1) Is this normal? Just increasing the timeout is the solution?

    2) How is this done normally? batching etc.

    We replicate data to all the client machines. The other tasks that I do in this SP are (in order)

    1) Take backup (only on server)

    2) DBCC checkdb

    3) Index Reorg

    4) Update stats

    5) Checkpoint (only on clients)

    Note: I cannot use Index Rebuild because I need to run this on SQLEXPRESS as well.

    Thanks in advance.

  • The best advice anyone can give about whether the time spent on the index reorg is normal is to take measurements and see what you normally get. The answer about how long this should take will depend on many factors, from the hardware you are running to the size of the data you are working with.

    There are some very good maintenance scripts you can find from Ola Hallengren at http://ola.hallengren.com/

    Lately, I've heard that you shouldn't even bother with an index reorg or rebuild. An index that has little or no fragmentation is great for doing sequential reads. If your data is on a SAN, then the underlying hardware will pretty much never have a chance to do sequential reads on your data files. The data files are probably striped over several different drives, there will be other stuff going on that the SAN is trying to serve up, and you will really not improve performance because the SAN will be accessing the data files in a random manner anyway.

    If you have your data in direct attached storage and it's dedicated to your server, that's a different story, and the index maintenance might help.

    What will definitely help in either case is to make sure you're doing your statistics maintenance.

    If you have your heart set on doing the index reorg, the general starting point is that you don't have to worry about the index if you have less than 30% fragmentation. Ola's scripts automatically take this into account, although you may have to modify them to not try a rebuild if that isn't available in SQL Express. I'm not sure what it would do, I've never tried it.

    If setting the threshold for only messing with an index if the fragmentation is higher than a certain percent (30%, maybe), then you could make a maintenance table for all the tables you want to reorg and have a column for the day it should be reorganized. Split the list into 7 parts and then write a script that does the reorg based on what tables are due for that day.

  • stevenb 63624 (10/31/2013)


    The best advice anyone can give about whether the time spent on the index reorg is normal is to take measurements and see what you normally get. The answer about how long this should take will depend on many factors, from the hardware you are running to the size of the data you are working with.

    There are some very good maintenance scripts you can find from Ola Hallengren at http://ola.hallengren.com/

    Lately, I've heard that you shouldn't even bother with an index reorg or rebuild. An index that has little or no fragmentation is great for doing sequential reads. If your data is on a SAN, then the underlying hardware will pretty much never have a chance to do sequential reads on your data files. The data files are probably striped over several different drives, there will be other stuff going on that the SAN is trying to serve up, and you will really not improve performance because the SAN will be accessing the data files in a random manner anyway.

    If you have your data in direct attached storage and it's dedicated to your server, that's a different story, and the index maintenance might help.

    What will definitely help in either case is to make sure you're doing your statistics maintenance.

    If you have your heart set on doing the index reorg, the general starting point is that you don't have to worry about the index if you have less than 30% fragmentation. Ola's scripts automatically take this into account, although you may have to modify them to not try a rebuild if that isn't available in SQL Express. I'm not sure what it would do, I've never tried it.

    If setting the threshold for only messing with an index if the fragmentation is higher than a certain percent (30%, maybe), then you could make a maintenance table for all the tables you want to reorg and have a column for the day it should be reorganized. Split the list into 7 parts and then write a script that does the reorg based on what tables are due for that day.

    Thanks for the reply. Here are the answers to your questions:

    1) I've a harddisk attached to server and clients. So index maintenance is necessary. Otherwise queries get slower (scans instead of seeks)

    2) I'm using threshold for reorganizing only the fragmented indexes.

  • Having the query engine choose a scan instead of a seek isn't going to be affected by how fragmented your index is. An index, by definition, is the ordering of the records. Fragmentation is when the physical order of the index is different from the logical order of the index.

    If you get scans instead of seeks after a period of time, I would be looking at statistics maintenance. The statistics tell the query engine what the distribution of data is and what is likely to be the fastest way to get at the data.

    Just for a poor analogy, think of the table like a book shelf. The index is going to be how you organize the books. Alphabetical by author last name, alphabetical by title, ordered by color... whatever. If you have a lot of index fragmentation in an index that's sorted alphabetically by title. What happens is that you look at a book and it says "The next book after me is located at location X". If the index is fragmented, X might be 2 shelves down. You know where the book is, but you still have to move to get to it. If the index is not fragmented, then X will be one book to the right and you just move right along.

    What happens when you get out of date statistics is that you say "I'm looking for all the books that start with the letter A". The query engine says "Well, I've only got 1000 books in the shelf and books that start with A makes up 50 % of the books I have. It's easier to just scan the table than it is to grab each individual book that starts with A." In reality, you have 5000 books, and now books that start with A are only 10% of the books on the shelf. It should have done the seek, but it didn't know that until after the query took way too long to run.

    The point is, index fragmentation won't cause scans, it will just make retrieving records from disk slightly slower. If you have scans where there should be seeks, then you need to look at statistics, and look at your query execution plan to see if you can tell why it is ignoring indexes you think it should be using.

  • stevenb 63624 (10/31/2013)


    Having the query engine choose a scan instead of a seek isn't going to be affected by how fragmented your index is. An index, by definition, is the ordering of the records. Fragmentation is when the physical order of the index is different from the logical order of the index.

    If you get scans instead of seeks after a period of time, I would be looking at statistics maintenance. The statistics tell the query engine what the distribution of data is and what is likely to be the fastest way to get at the data.

    Just for a poor analogy, think of the table like a book shelf. The index is going to be how you organize the books. Alphabetical by author last name, alphabetical by title, ordered by color... whatever. If you have a lot of index fragmentation in an index that's sorted alphabetically by title. What happens is that you look at a book and it says "The next book after me is located at location X". If the index is fragmented, X might be 2 shelves down. You know where the book is, but you still have to move to get to it. If the index is not fragmented, then X will be one book to the right and you just move right along.

    What happens when you get out of date statistics is that you say "I'm looking for all the books that start with the letter A". The query engine says "Well, I've only got 1000 books in the shelf and books that start with A makes up 50 % of the books I have. It's easier to just scan the table than it is to grab each individual book that starts with A." In reality, you have 5000 books, and now books that start with A are only 10% the books on the shelf. It should have done the seek, but it didn't know that until after the query took way too long to run.

    The point is, index fragmentation won't cause scans, it will just make retrieving records from disk slightly slower. If you have scans where there should be seeks, then you need to look at statistics, and look at your query execution plan to see if you can tell why it is ignoring indexes you think it should be using.

    Thanks again. Yeah my bad. It did scans because statistics weren't updated as it was after the Reorg. and command timed out on Reorgs.

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply