Index rebuild

Question

Index rebuild

premkuttan

SSCommitted

Points: 1934
More actions
April 27, 2021 at 5:59 pm

#3876454

I have a DB of size 240 GB and when the rebuild index job runs the log file grows upto 300 GB. I do a online reindex activity. I change the DB to bulklogged and trigger the reindex and post reindex i bring it back to FULL. Any idea why the log file grows greater then MDF file.
Scheduled reboot of server happens when the reindex job runs. Can this be a reason for the log file growth.

Viewing 15 posts - 1 through 15 (of 33 total)

You must be logged in to reply to this topic. Login to reply

Jeff Moden SSC Guru Points: 1003863 More actions · Answer 1

premkuttan wrote:

I have a DB of size 240 GB and when the rebuild index job runs the log file grows upto 300 GB. I do a online reindex activity. I change the DB to bulklogged and trigger the reindex and post reindex i bring it back to FULL. Any idea why the log file grows greater then MDF file.
Scheduled reboot of server happens when the reindex job runs. Can this be a reason for the log file growth.

Are you ONLY doing REBUILDs or is REORGANIZE also a part of the mix?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

premkuttan SSCommitted Points: 1934 More actions · Answer 2

It has both reorg and rebuild. My index job runs through a different tool so it keeps to retry even if there is a interruption in SQL connection

Jeff Moden SSC Guru Points: 1003863 More actions · Answer 3

premkuttan wrote:

It has both reorg and rebuild. My index job runs through a different tool so it keeps to retry even if there is a interruption in SQL connection

REORGANIZE is probably NOT doing you any good (it does NOT do what most people think it does) and it's not the quiet little kitten (especially when it comes to log file usage) that everyone makes it out to be. For example, I had a 147GB clustered index that had "end of index" fragmentation at only 12% logical fragmentation. When I used REORGANIZE on it, the log file exploded from ~20GB to 227GB! and took an hour and 21 minutes to complete. An offline REBUILD in the Bulk Logged Recovery Model took only 12 MINUTEs and 44 seconds and the log file only grew from ~20GB to only 37GB.

While REORGANIZE will certainly fix "logical fragmentation", it actually causes most fragmented indexes to be setup to perpetuate massive fragmentation almost immediately. The reason is because REORGANIZE compresses partially full pages up to the Fill Factor but it cannot create new pages and so the fullness of any pages above the fill factor value cannot be reduced, which furthers fragmentation, as well.

No... even though I've done about 600 hours of testing on index maintenance, you don't have to take my word for it. Take what's written in the documentation... my testing just proves it. Goto the following link and carefully read the following paragraph in that link. Then read it until you actually understand the impact of what it says.

https://docs.microsoft.com/en-us/sql/relational-databases/system-dynamic-management-views/sys-dm-db-index-physical-stats-transact-sql?view=sql-server-ver15

So, what is that saying???

It's saying that you (and most of the rest of the world that uses current "Best Practices") probably don't know what REORGANIZE actually does to your indexes and don't understand how it perpetuates fragmentation and adds further proof to the fact that what people are calling and using as "Best Practices" for index maintenance is actually a WORST PRACTICE.

My current recommendation is that the only two places you should use REORGANIZE is to compress LOBs (and only when needed) and (possibly) to fix the instant fragmentation caused on "Ever-Increasing" indexes that have a pattern of being inserted into and then being made to suffer "ExpAnsive Updates" to those recently inserted rows. Even then, be warned of the high cost of using REORGANIZE for anything when it comes to the log file and the fact that it's going to setup indexes that fragment for more fragmentation. About the ONLY thing it does on most indexes is to remove logical fragmentation. It does absolutely NOTHING to actually prevent fragmentation when you've done something like lowering the Fill Factor to try to prevent fragmentation. It's totally useless there.

So, with the understanding that there's a whole lot more to understand, my recommendation is to stop using REORGANIZE (except for LOB compression and then follow that with a REBUILD) and only do REBUILDs. If you can't do offline REBUILDs, online REBUILDs are a decent second choice. If you don't have the Enterprise Edition and so can't do online REBUILDs and can't afford the time to do offline REBUILDs (which are nasty fast even in the full recovery model), then simply wait until you can. Whatever you do, don't blindly use REORGANIZE because it will actually compress your indexes and make them fragment all day every day. Instead, just rebuild stats until you can rebuild your indexes. Doing no index maintenance is better than doing it wrong and, except for some rare instances, using REORGANIZE on indexes that fragment is doing it wrong.

Last but not least, consider NOT doing index maintenance on indexes that have a "0" Fill Factor if they're fragmenting. Doing index maintenance on such indexes (whether using REBUILD or REORGANIZE) will only remove ALL free space and be the cause of massive page splits and fragmentation. Either assign a proper Fill Factor or just rebuild stats until you can so that you don't suffer from the "morning after" blockages that will occur.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

ScottPletcher SSC Guru Points: 100948 More actions · Answer 4

I've found one other use for REORG. When I've done massive deletes on a table, I've found that a REORG first is sometimes better than a REBUILD first. Follow the REORG with a REBUILD. Not sure why that is, but I've seen it.

SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

Jeff Moden SSC Guru Points: 1003863 More actions · Answer 5

Ah... gotta be careful even with those conditions. If you can tolerate the extreme log file usage on such things (the 147/227GB table I cited above was fragmented only my near-end-of-index deletes), then maybe. Just know that it's going to cause a whole lot more log file usage than a rebuild and that's a part of what the op is having problems with.

Personally, I don't use REORGANIZE at all even when LOBs need compaction (it also took 10 REORGs to get the LOBs back into shape in several tests I ran). I also force LOBs to be out of row because of the "Trapped Short Page" problems that in-row LOBs create. Why MS decided to change the default to in-row in 2005 is totally beyond me. I see absolutely no redeeming qualities for in-row LOBs. I have a whole separate presentation on that and how the same problems frequently occur for larger NVARCHAR and VARCHAR columns even when they're not LOBs (which also means you should frequently consider converting them to LOBs even though they'll never get that big).

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

ScottPletcher SSC Guru Points: 100948 More actions · Answer 6

I don't think you should be repeatedly rebuilding 240GB of data. Much of that data must be historical, i.e. unchanging, so there's no need to use vast resources to keep rebuilding it.

Instead, partition the table: either with formal partitioning or by using views. Put the old-enough-it's-not-changing data into a separate partition(s)/table(s) from the active data. Then rebuild only the active part. As necessary, adjust the partition border/range to move data into the historical partition/table.

SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

ScottPletcher SSC Guru Points: 100948 More actions · Answer 7

Jeff Moden wrote:

and can't afford the time to do offline REBUILDs (which are nasty fast even in the full recovery model), then simply wait until you can.

Not necessarily. You seem never to use page compression. Too bad, it's a great feature. Especially for a 240GB table! But rebuilds with page compression are NOT "nasty fast"; they are, in fact, rather slow.

SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

ScottPletcher SSC Guru Points: 100948 More actions · Answer 8

I still have to disagree on REORG if: (1) you can't use online rebuilds AND (2) the table has some serious fragmentation you need to address. It helps if you've properly split your historical and current data into separate partitions/tables, but it's not an absolute requirement IF you've rebuilt the table fairly recently, since the historical data was compacted by that rebuild.

Prior to the REORG, you drop your fill factor below what it normally is, typically by 5 to 10%. REORG will only compact pages that need compacted. The pages that are compacted will then be less dense than they would be after a rebuild. I think this makes sense since these are the pages that are getting fragmented, then they may need some additional freespace (freespace = inverse of fillfactor, i.e. 90% fill = 10% free).

When I was on SQL 2008 Standard, I used to have to use REORG moderately frequently, and generally it did work well. Now I'm on SQL 2016 Enterprise (working now full time on an upgrade to SQL 2019E), so I almost hever have to futz about with REORG very often. Only, as I said, if I've done massive deletes on a very large table. The REORG in that case is being used solely to remove empty pages, so, from what I've found, it's actually more efficient than a rebuild, because that typically affects (nearly) all pages. Again, I find REORG works well for that.

SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

Jeff Moden SSC Guru Points: 1003863 More actions · Answer 9

ScottPletcher wrote:

I still have to disagree on REORG if: (1) you can't use online rebuilds AND (2) the table has some serious fragmentation you need to address AND (3) you've properly split your historical and current data into separate partitions/tables.
Prior to the REORG, you drop your fill factor below what it normally is, typically by 5 to 10%. REORG will only work on pages that need worked on. The pages that are touched will then be less compacted than they would be after a rebuild. I think this makes sense in that if these are the pages that are getting fragmented, then they may need some additional freespace (freespace = inverse of fillfactor, i.e. 90% fill = 10% free).
When I was on SQL 2008 Standard, I used to have to use REORG moderately frequently, and generally it did work well. Now I'm on SQL 2016 Enterprise (working now full time on an upgrade to SQL 2019E), so I almost hever have to futz about with REORG very often. Only, as I said, if I've done massive deletes on a very large table. The REORG in that case is being used solely to remove empty pages, so, from what I've found, it's actually more efficient than a rebuild, because that always affects all pages. Again, I find it works well.

You can say that "REORG will only work on pages that need worked on" all you want and you'll get no argument from me. That doesn't change a thing about what I said, though. REORGANIZE of a 147GB CI at only 12% fragmentation and only near the "recent end" of an ever-increasing index grew the log file from ~20GB to 227GB. A Rebuild only caused the log file to grow to 37GB (both tests were done on restores from the same backup file).

Like I said, REORGANIZE doesn't work the way people (including you) think it does and I've done the testing that proves that. If you want to continue to use REORGANIZE for such a thing, heh... knock yourself out. Just be advised that it can also knock your logfile out. 😉

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

ScottPletcher SSC Guru Points: 100948 More actions · Answer 10

Jeff Moden wrote:

ScottPletcher wrote:
I still have to disagree on REORG if: (1) you can't use online rebuilds AND (2) the table has some serious fragmentation you need to address AND (3) you've properly split your historical and current data into separate partitions/tables.
Prior to the REORG, you drop your fill factor below what it normally is, typically by 5 to 10%. REORG will only work on pages that need worked on. The pages that are touched will then be less compacted than they would be after a rebuild. I think this makes sense in that if these are the pages that are getting fragmented, then they may need some additional freespace (freespace = inverse of fillfactor, i.e. 90% fill = 10% free).
When I was on SQL 2008 Standard, I used to have to use REORG moderately frequently, and generally it did work well. Now I'm on SQL 2016 Enterprise (working now full time on an upgrade to SQL 2019E), so I almost hever have to futz about with REORG very often. Only, as I said, if I've done massive deletes on a very large table. The REORG in that case is being used solely to remove empty pages, so, from what I've found, it's actually more efficient than a rebuild, because that always affects all pages. Again, I find it works well.
You can say that "REORG will only work on pages that need worked on" all you want and you'll get no argument from me. That doesn't change a thing about what I said, though. REORGANIZE of a 147GB CI at only 12% fragmentation and only near the "recent end" of an ever-increasing index grew the log file from ~20GB to 227GB. A Rebuild only caused the log file to grow to 37GB (both tests were done on restores from the same backup file).
Like I said, REORGANIZE doesn't work the way people (including you) think it does and I've done the testing that proves that. If you want to continue to use REORGANIZE for such a thing, heh... knock yourself out. Just be advised that it can also knock your logfile out. 😉

Did you reduce the fillfactor before you ran the REORG?

It's likely not worth reorg or rebuild for only 12% on a large table, again unless you split the historical data out first.

I know, you know everything and the "rest of the world" knows nothing.

SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

ScottPletcher SSC Guru Points: 100948 More actions · Answer 11

Presumably that 147GB CI had (nearly) every page out of logical order or (nearly) every page was below the fillfactor. Hmm, veryh odd, interesting.

Again, I can't imagine needing an unpartitioned table that large. Was there actually 147GB of active data? We deal with very large volumes of data, but not that much in one table that is active at the same time. That would be a real challenge to manage.

I don't have time to set up and do full tests on any of this stuff now. When I get a chance, I will do that.

SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

ScottPletcher SSC Guru Points: 100948 More actions · Answer 12

I do need to add one caveat: if you DELETE rows throughout the table, you probably should avoid reorg. Once our data exists, we don't delete it, it never just disappears. We only delete as part of a move of the data to historical/archive data.

SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

ScottPletcher SSC Guru Points: 100948 More actions · Answer 13

Jeff Moden wrote:

ScottPletcher wrote:
I still have to disagree on REORG if: (1) you can't use online rebuilds AND (2) the table has some serious fragmentation you need to address AND (3) you've properly split your historical and current data into separate partitions/tables.
Prior to the REORG, you drop your fill factor below what it normally is, typically by 5 to 10%. REORG will only work on pages that need worked on. The pages that are touched will then be less compacted than they would be after a rebuild. I think this makes sense in that if these are the pages that are getting fragmented, then they may need some additional freespace (freespace = inverse of fillfactor, i.e. 90% fill = 10% free).
When I was on SQL 2008 Standard, I used to have to use REORG moderately frequently, and generally it did work well. Now I'm on SQL 2016 Enterprise (working now full time on an upgrade to SQL 2019E), so I almost hever have to futz about with REORG very often. Only, as I said, if I've done massive deletes on a very large table. The REORG in that case is being used solely to remove empty pages, so, from what I've found, it's actually more efficient than a rebuild, because that always affects all pages. Again, I find it works well.
You can say that "REORG will only work on pages that need worked on" all you want and you'll get no argument from me. That doesn't change a thing about what I said, though. REORGANIZE of a 147GB CI at only 12% fragmentation and only near the "recent end" of an ever-increasing index grew the log file from ~20GB to 227GB. A Rebuild only caused the log file to grow to 37GB (both tests were done on restores from the same backup file).
Like I said, REORGANIZE doesn't work the way people (including you) think it does and I've done the testing that proves that. If you want to continue to use REORGANIZE for such a thing, heh... knock yourself out. Just be advised that it can also knock your logfile out. 😉

I am curious about how the log was able to reach 247GB. It must not have been backed up during the reorg, since reorg is done in many small trans not one mega trans like a rebuild. But typically logs are backed up every 15 minutes. Now, if you were able to get 247GB written for one table on one db in 15 min., then huge congrats on your i/o subsystem! If not, something's odd. There had to be something preventing the log from being truncated.

SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

Sergiy SSC Guru Points: 110206 More actions · Answer 14

How massive are your DELETEs?

Probably it would be easier to copy the remaining part of data to a new table (exact copy of the old one) and then drop old one and rename the new one to the old name.

All indexes will be freshly built, and almost no logging involved.

_____________
Code for TallyGenerator