Alter Index... Sort_In_Tempdb On

  • This isn't a major issue just something I've been researching. Here's some background on my situation.

    I have an admin database that I use to track performance and other items of interest. There is one table of that tracks memory usage and has 6 fields. I am using 2 fields to make up a clustered index with approximately 310,000 records. The data file is 500mb and the log file is 279mb. The log usage is normally around 1% (3mb). My issue arises when indexes are evaluated on weekends and are rebuilt/reorganized accordingly. When the index is rebuilt, the log spikes to 99% and grows if necessary.

    After doing some research, I found out why it happens. (not the best explanation)When a clustered index is rebuilt it is broken down, sorted, and rebuilt as a second index. Only then is the original index dropped. This operation is fully logged thus blowing up the transaction log. I've come across a few solutions for this issue.

    -First is obviously to leave the log as big as it is (or as big as it needs to be). This is the easiest but makes me feel the log is just wasted space for the other 98% of the time. I'm not too interested in shrinking the log every week.

    -Second is to use the SORT_IN_TEMPDB ON option during the rebuild operation. This is what interests me and mainly why I am writing this post. As far as I know, the sorting would take place in the tempdb database and be logged in its transaction log, relieving the admin database of such duties. If all conditions were met, this is ideal for performance but my main concern is not letting disk space get out of hand. The downside is the transference of used space from the admin database to tempdb therefore not really reducing the overall footprint of the instance.

    -Third is to change the recovery model to bulk logged just before the rebuild operation and back to full afterwards. This will minimally log the rebuild operation thus reducing the amount of log growth.

    Does anyone have any real world experience or opinions on these matters?

    references -

    Transaction Log Disk Space for Index Operations

    http://msdn.microsoft.com/en-us/library/ms184246.aspx

    Disk Space Requirements for Index DDL Operations

    http://msdn.microsoft.com/en-us/library/ms179542.aspx

    Choosing a Recovery Model for Index Operations

    http://msdn.microsoft.com/en-us/library/ms191484(v=SQL.105).aspx

    ______________________________________________________________________________________________
    Forum posting etiquette.[/url] Get your answers faster.

  • calvo (1/4/2011)


    This isn't a major issue just something I've been researching. Here's some background on my situation.

    I have an admin database that I use to track performance and other items of interest. There is one table of that tracks memory usage and has 6 fields. I am using 2 fields to make up a clustered index with approximately 310,000 records. The data file is 500mb and the log file is 279mb. The log usage is normally around 1% (3mb). My issue arises when indexes are evaluated on weekends and are rebuilt/reorganized accordingly. When the index is rebuilt, the log spikes to 99% and grows if necessary.

    After doing some research, I found out why it happens. (not the best explanation)When a clustered index is rebuilt it is broken down, sorted, and rebuilt as a second index. Only then is the original index dropped. This operation is fully logged thus blowing up the transaction log. I've come across a few solutions for this issue.

    -First is obviously to leave the log as big as it is (or as big as it needs to be). This is the easiest but makes me feel the log is just wasted space for the other 98% of the time. I'm not too interested in shrinking the log every week.

    -Second is to use the SORT_IN_TEMPDB ON option during the rebuild operation. This is what interests me and mainly why I am writing this post. As far as I know, the sorting would take place in the tempdb database and be logged in its transaction log, relieving the admin database of such duties. If all conditions were met, this is ideal for performance but my main concern is not letting disk space get out of hand. The downside is the transference of used space from the admin database to tempdb therefore not really reducing the overall footprint of the instance.

    -Third is to change the recovery model to bulk logged just before the rebuild operation and back to full afterwards. This will minimally log the rebuild operation thus reducing the amount of log growth.

    Does anyone have any real world experience or opinions on these matters?

    references -

    Transaction Log Disk Space for Index Operations

    http://msdn.microsoft.com/en-us/library/ms184246.aspx

    Disk Space Requirements for Index DDL Operations

    http://msdn.microsoft.com/en-us/library/ms179542.aspx

    Choosing a Recovery Model for Index Operations

    http://msdn.microsoft.com/en-us/library/ms191484(v=SQL.105).aspx

    1) why in the world are you concerned about a 279MEGABYTE transaction log?? If your disk space is that tight you have some very serious issues to deal with.

    2) sort in tempdb is a wonderful thing for numerous reasons such as being able to split IO usage, avoiding tlog us in primary db, better contiguousness of data layout potentially

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • TheSQLGuru (1/4/2011)


    1) why in the world are you concerned about a 279MEGABYTE transaction log?? If your disk space is that tight you have some very serious issues to deal with.

    2) sort in tempdb is a wonderful thing for numerous reasons such as being able to split IO usage, avoiding tlog us in primary db, better contiguousness of data layout potentially

    You're right. a 279mb log isn't big time but as I said, it's not an issue right now. I was interested in how others might have handled the same situation. Perhaps I could learn from their experience and leverage that if this situation were to ever arise in a more important production database.

    ______________________________________________________________________________________________
    Forum posting etiquette.[/url] Get your answers faster.

  • I recently switched our index rebuild scripts to use tempdb for the sort runs instead of the database containing the index. I had the same issue where the transaction log was ballooning to about half the size of the database during the nightly rebuild task. With a 500 GB database and a Neverfail implementation all that excess log file I/O was definitely becoming a headache.

    Switching to tempdb resolved the excessive log growth issue. The amount of time it takes to perform the reindexing also seems to have improved. The only catch is that you have to make sure have sufficient space on your tempdb disk set to accommodate the amount of index data it will store during the sorting. With the size of your database now you really don't have to worry but I would definitely recommend using this (or at least testing the performance) as you manage larger DBs.

    Hope this helps.

  • Thanks Chris,

    I was thinking more about it last night (dreaming of sql server? i need a break) and thought of an additional caveat to using sort_in_tempdb. When I rebuild indexes, I loop through each database and rebuild/reorg as needed. Sorting in the tempdb would allow me to only have one large log file and not multiple large log files, one for each database with index operations.

    I'm implementing the sort_in_tempdb this weekend in development. we'll see how it goes.

    ______________________________________________________________________________________________
    Forum posting etiquette.[/url] Get your answers faster.

  • Good luck with the testing. Don't forget to size tempdb appropriately and not wait for autogrowths to size it for you.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • TheSQLGuru (1/7/2011)


    Good luck with the testing. Don't forget to size tempdb appropriately and not wait for autogrowths to size it for you.

    Yep. It's happening Saturday night and there's no better way I could think of spending my Saturday night than watching transaction logs grow.

    I set the tempdb log equivalent to the largest log growth for the admin database and will keep an eye on usage during the maintenance.

    ______________________________________________________________________________________________
    Forum posting etiquette.[/url] Get your answers faster.

  • Alright, it didn't go as expected. I set sort_in_tempdb = on and didn't get the results i expected. Looking over the logs, the index was reorganized and not rebuilt. This got me thinking about performance so I set up a test.

    I have two tables, sourcedata and test, each with a clustered index.

    Here are the steps I would take before each round:

    -insert 500k records from test.sourcedata into test.test

    -insert 500k records into test.test to create fragmentation (99% and 3300 pages)

    -truncate transaction log for more accurate results

    I would note the log sizes of tempdb and the test database after each truncate and again after each index operation. Here is what I got:

    inserted 500k (total 1mil) and truncate log

    --testdb 3.7%

    --tempdb 9.5%

    rebuild sort in tempdb on

    --testdb 8.5%

    --tempdb 9.5%

    inserted 500k (total 1mil) and truncate log

    --testdb 0.82%

    --tempdb 9.5%

    rebuild sort in tempdb off

    --testdb 5.6%

    --tempdb 9.6%

    inserted 500k (total 1mil) and truncate log

    --testdb 4.99%

    --tempdb 9.5%

    reorganize

    --testdb 29.7%

    --tempdb 9.5%

    What gets me here is that reorganize had more of an effect on the log than a rebuild operation. It could be my own misunderstanding, but I thought a rebuild was more log intensive. I'm also unsure why the sort_in_tempdb = on didn't result in a larger tempdb log file, the testdb log actually grew more.

    ______________________________________________________________________________________________
    Forum posting etiquette.[/url] Get your answers faster.

  • Hi,

    I am doing rebuilds/reorgs on the live database in our company and have the same issues/things seen and a few ideas how it is working.

    Sort in TempDB I have not used at the moment but normally I don't rebuild/reorg clustered indexes because they normally don't tend do fragment that much so I do them manually once a while.

    In this document http://technet.microsoft.com/en-us/library/cc966402.aspx you have quite a lot of information about space usage and where, maybe there is the explanation you are looking for.

    You truncated/ shrinked the log so you have gone back to, lets say 20MB every time?

    Have you done this rebuilds with online=on or off?

    BR Christoph

  • Christoph,

    Yes, each round I would truncate the log before introducing fragmentation. The rebuild was happening offline (default).

    ______________________________________________________________________________________________
    Forum posting etiquette.[/url] Get your answers faster.

  • I made a few tests with 1608 pages of data (+10 indexpages) , one int column only, 1 mio rows.

    The TempDB did not grow but the datafile is bigger than the table I have made my tests with. The rest is interresting because it does not count up with your numbers as well as my numbers in the live-system.

    Rebuild (sort in tempDb)

    Datafilegrowth 3.200 (70%)

    Logfilegrowth 1.816

    Rebuild (don't sort in tempDb)

    Datafilegrowth 3.200 (70%)

    Logfilegrowth 1.816

    Reorg

    Datafilegrowth 00 (0%)

    Logfilegrowth 1.576

    But it at least proves one thing as far as I interpret these numbers. He does not sort in tempdb because there is nothing to sort, the data is already in the correct order (because it is in an index and ordered).

    The only thing here is that I normally prefer Rebuild with online=on to reorg because I had a few issues where the reorg ran quite a long time. And the logfile-usage was about 5-6 times the size of the index so I changed the script that the big indexes are rebuild instead. It I guess it costs more performance but with mirroring and replication running a fast growing transactionlog has its backdraws.

    BR Christoph

  • I did some more testing, focusing on reorganize since it is the culprit of the log growth. I ran the same test sequence four times for different fill factors. Here are my results.

    -------------------------

    reorganize fill factor 90%

    -------------------------

    before 4.9%

    after 28.1%

    before 4.9%

    after 28.1%

    before 1.0%

    after 24.1%

    before 1.4%

    after 24.7%

    -------------------------

    reorganize fill factor 80%

    -------------------------

    before 3.1%

    after 8.6%

    before 3.0%

    after 22.6%

    before 1.1%

    after 20.5%

    before 0.4%

    after 20.0%

    -------------------------

    reorganize fill factor 70%

    -------------------------

    before .5%

    after 15.6%

    before 7.2%

    after 22.4%

    before .5%

    after 15.5%

    before .9%

    after 16%

    -------------------------

    reorganize fill factor 60%

    -------------------------

    before .8%

    after 12.9%

    before 2.4%

    after 14.2%

    before .9%

    after 12.9%

    before 5.4%

    after 17.4%

    The results show me that the lower the fill factor, the less log space needed to reorganize (makes sense). Leaving extra space on each page might initially create more pages, but less pages needed after inserts are done. This means there are less pages to shuffle around during the reorg operation.

    This makes sense to me in my head, barely. Is anyone else able to explain this in a little more detail?

    I think this is going to be my solution. On the index that bloats the log, I'm going to change the fill factor to 75 or 80 and see how that works this weekend.

    ______________________________________________________________________________________________
    Forum posting etiquette.[/url] Get your answers faster.

  • Seems like a good way to go but keep in mind to do a complete rebuild once in a while, reorganize won't "refresh" the Fillfactor (except if you are deleting too). So if you create an index that is likely to fragment with fillfactor 70, it will work fine till the pages are full.

    I have to take a look at the internal tables again, I couldn't find the fillfactor in a quick sweep over the tables but it should be possible to create a reorg/rebuild-Script depending on the fragmentation as well as the deviation of the current PageUsage vs. FillFactor.

    It seems like it is always the same with fillfactor/rebuild/reorg. There is no easy solution, no "one serves all"-script somewhere and quite a few years of sleepless nights have been spent on this topic

  • Christoph D (1/12/2011)


    Seems like a good way to go but keep in mind to do a complete rebuild once in a while, reorganize won't "refresh" the Fillfactor (except if you are deleting too). So if you create an index that is likely to fragment with fillfactor 70, it will work fine till the pages are full.

    I have to take a look at the internal tables again, I couldn't find the fillfactor in a quick sweep over the tables but it should be possible to create a reorg/rebuild-Script depending on the fragmentation as well as the deviation of the current PageUsage vs. FillFactor.

    It seems like it is always the same with fillfactor/rebuild/reorg. There is no easy solution, no "one serves all"-script somewhere and quite a few years of sleepless nights have been spent on this topic

    I have a similar issue with my database log balooning as a result of overnight reindex/reorg maintenance.

    My database is being mirrored to a DR/reporting environment and the mirror server cannot keep up with the added load of the increased transaction activity due to re-indexing.

    As a result, the mirror breaks down, and I end up with a lot of frustrated users who cannot access the mirroring snapshots on the DR server the next day.

    Does that situation sound familiar to anyone? 🙂

    I'm thinking of leveraging the sort_in_tempdb option to relieve impact on the log of the user db.

    I think this option applies only to re-indexing though, and not to reorg.

    __________________________________________________________________________________
    SQL Server 2016 Columnstore Index Enhancements - System Views for Disk-Based Tables[/url]
    Persisting SQL Server Index-Usage Statistics with MERGE[/url]
    Turbocharge Your Database Maintenance With Service Broker: Part 2[/url]

  • Please post new questions in a new thread.

    Sort in TempDB option is not about log space used. It just means that the 20% or so sort space is allocated from TempDB not the user database.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply