“My transaction log is too big!” I wish I had a dime for each time that phrase has been spoken, written, tweeted, etc. I’ve read about it in various forums, blog posts, etc. I’ve discussed it with co-workers and other DBAs I know. I know it’s not, but sometimes it feels like the most common complaint I’ve ever heard. And in almost every case the reason behind it can be broken down into one of two distinct possibilities.
First: You have a problem.
About now you generally hear the person who made the statement in the first place say “Duh! I just said that. Haven’t you been paying attention?”
Before I go any further I should probably explain something about log files. Internally a log file is broken up into smaller pieces called VLFs (Virtual Log Files). So what exactly is a VLF? Well, here is a good description, but for this discussion it’s enough to know that a VLF has two states: “In Use” and “Ready for use”. This is probably a bit of an oversimplification, I’m not entirely sure, but it’s close enough for right now. If the Database Engine is writing to the log file and reaches the end of the last VLF, it checks the availability of the first VLF (log files are circular). If the first VLF is also “In Use”, then the Database Engine is going to try to grow the log file. If it can’t for some reason (size restrictions, out of space on the drive, etc.) then you get an error. If it can grow, then more VLFs are allocated depending on the size of the growth.

Here’s the kicker. Each and every bit in those new VLFs has to be set to 0. FYI, this isn’t affected by the “Instant File Initialization” setting; that only affects data file growth. Writing all of those 0s takes time. If a large amount of space is allocated, then it can take a large amount of time. Either way, for someone waiting on a transaction to complete it’s going to feel slow. This is one of the reasons that autogrowth on a log file is considered a pain and should be avoided if possible. On the other hand, not allowing autogrowth can end up being an even bigger pain if you run out of space. This is one reason why everyone suggests “right sizing” your log file to begin with. However, that’s another discussion for another time.
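If you want to see the VLFs for yourself, here’s a quick sketch. The `sys.dm_db_log_info` DMF is available from SQL Server 2016 SP2 onward; on older versions the (undocumented but widely used) `DBCC LOGINFO` gives similar per-VLF output.

```sql
-- One row per VLF in the current database's log file (2016 SP2+)
SELECT  file_id,
        vlf_begin_offset,
        vlf_size_mb,
        vlf_active          -- 1 = "In Use", 0 = "Ready for use"
FROM    sys.dm_db_log_info(DB_ID());

-- Older versions: Status 2 = "In Use", Status 0 = "Ready for use"
DBCC LOGINFO;
```

A log file with thousands of tiny VLFs is its own performance problem, which is another reason “right sizing” matters.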
Next point about transaction logs. The ONLY thing that will free a VLF for reuse (assuming the FULL or BULK-LOGGED recovery model) is a transaction log backup. In the rest of this article I’m going to talk about reasons why a VLF might not be freed for reuse. If your database is in the FULL or BULK-LOGGED recovery model, then every time I say “reasons why a VLF might not be freed for reuse” I want you to add “after a transaction log backup”. Repeat after me: “reasons why a VLF might not be freed for reuse AFTER A TRANSACTION LOG BACKUP”. If your database is in the SIMPLE recovery model, then a VLF will be marked for reuse once no transaction within it is still needed for anything else. In the case of SIMPLE recovery the statement “reasons why a VLF might not be freed for reuse” can stand on its own.
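Conveniently, SQL Server will tell you exactly what it’s waiting on before it can reuse log space. A quick sketch (the database name and backup path are hypothetical):

```sql
-- log_reuse_wait_desc tells you why the log can't be cleared:
-- LOG_BACKUP, ACTIVE_TRANSACTION, DATABASE_MIRRORING, REPLICATION, etc.
SELECT name, recovery_model_desc, log_reuse_wait_desc
FROM   sys.databases
WHERE  name = N'MyDatabase';

-- In FULL or BULK-LOGGED, only a log backup marks VLFs "Ready for use"
BACKUP LOG MyDatabase
TO DISK = N'D:\Backups\MyDatabase_log.trn';
```

Checking `log_reuse_wait_desc` first can save you a lot of guessing about which of the problems below you actually have.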
So, now after all that, what are the possible problems? Remember that if the very first VLF is not “Ready for use” then the transaction log can’t roll around and has to grow. There are several reasons the VLFs might not be freed up. There might be one or more transactions that are open and left “hanging”. For example, a hypothetical power user may have an MS Access application that opened a transaction, but for some reason he didn’t commit it. Then he locked his computer and left for vacation. Sorry for the bitter moment there, but I want you to imagine how difficult that one might have been to track down. Hypothetically, of course. Another possibility is a problem with Mirroring or Replication. When these types of features are enabled, a VLF can’t be marked “Ready for use” until the transactions within it have been sent to the appropriate location (mirroring partner, replicated DB, etc.). If this is the case, then the problem must be fixed and Mirroring and/or Replication must be allowed to catch up before the VLFs are going to be freed up. AlwaysOn may be the same, but I haven’t read enough on it to be certain.
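Tracking down that hanging transaction is easier than it used to be. A sketch using documented commands and DMVs:

```sql
-- Quick check: the oldest open transaction in the current database
DBCC OPENTRAN;

-- More detail: every session with an open transaction, oldest first
SELECT s.session_id,
       s.login_name,
       s.host_name,
       t.transaction_begin_time
FROM   sys.dm_tran_session_transactions AS st
JOIN   sys.dm_tran_active_transactions  AS t
       ON t.transaction_id = st.transaction_id
JOIN   sys.dm_exec_sessions             AS s
       ON s.session_id = st.session_id
ORDER BY t.transaction_begin_time;
```

If one of those rows has a `transaction_begin_time` from last Tuesday, you’ve probably found your vacationing power user.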
The most common issue I put in the “Problem” category is also the simplest. And oddly, it’s one that is frequently caused because people don’t realize it’s a problem. The database is in the FULL or BULK-LOGGED recovery model and no transaction log backups are being taken. Remember what I said above: in FULL or BULK-LOGGED recovery, the VLFs won’t be marked “Ready for use” until a transaction log backup is taken. Over time, even with the smallest, lowest-use database, the log file is going to get HUGE. I once inherited a database that had a 30MB data file and a 2GB log file! The database had been around for about 7 years and had never had a log backup! I’ve heard all types of reasons for this, but they primarily come under one of the following types.
- “It’s a small database with only a few transactions, log backups aren’t needed.”
- “It’s a test/dev database. We don’t care about log backups.”
- “We don’t take transaction log backups so we can save space.”
So let’s go over these arguments one at a time. “It’s a small database with only a few transactions, log backups aren’t needed.” I have only one thing to say here: 30MB data file with a 2GB log!

Next is “It’s a test/dev database. We don’t care about log backups.” I really have a hard time with this one. If you don’t care about log backups, why are you using the FULL or BULK-LOGGED recovery model? The whole purpose of taking transaction log backups is so that you have point-in-time recovery. If you don’t care about point-in-time recovery, then go with SIMPLE recovery; that’s what it’s there for. If you do care about point-in-time recovery, then take the bleeping log backups! One thing I should point out here is operational recovery. Just because you don’t care about point-in-time recovery doesn’t mean that the developer who just dropped a stored procedure, table, etc. in dev, and would really like it back, doesn’t.

The last category of “reasons” (“We don’t take transaction log backups so we can save space.”) is mostly a misunderstanding of how the backups work. Let’s say you end up in a situation where you have that 30MB data file and the full 2GB log file. When you take a full backup, can you guess how much is actually going to be backed up? I’ll give you a hint: it’s not 30MB. In fact, by not taking the log backups you are actually losing more and more space over time.
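If you suspect you have one of these “never backed up” databases lurking around, you can check from `msdb`’s backup history. A sketch (the 24-hour threshold is just an example):

```sql
-- Databases in FULL or BULK_LOGGED recovery with no log backup
-- in the last 24 hours (or ever)
SELECT d.name,
       d.recovery_model_desc,
       MAX(b.backup_finish_date) AS last_log_backup
FROM   sys.databases AS d
LEFT JOIN msdb.dbo.backupset AS b
       ON b.database_name = d.name
      AND b.type = 'L'            -- 'L' = transaction log backup
WHERE  d.recovery_model_desc <> 'SIMPLE'
  AND  d.name <> 'model'
GROUP BY d.name, d.recovery_model_desc
HAVING MAX(b.backup_finish_date) IS NULL
    OR MAX(b.backup_finish_date) < DATEADD(HOUR, -24, GETDATE());
```

A `NULL` in `last_log_backup` on a seven-year-old database is exactly the 30MB/2GB situation above waiting to happen.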
Second: One or more transactions caused your log to be the size it is.
You’ve decided that, in fact, you don’t have a problem. Replication is working fine, transaction log backups are being taken on a regular basis, etc. So why is the log so big? Well, one or more processes running against the database require that much space. Typically these processes are the normal workload, and if that’s the case, leave the transaction log alone. It requires that much space, and anything you do to shrink it will cause you grief.
There are a couple of possibilities that might let you clear up some space. First possibility: a large one-time process ran that took up an unusual amount of log space. Examples might include a one-time large data load or archiving/deleting a large amount of data. In this case the particular process that caused your log file to grow is over, and you are safe in reducing the size of your log file to its previous (hopefully stable) size.
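In that one-time-event case, the shrink itself looks something like this. The logical file name and the 512MB target are hypothetical; use whatever your previously stable size was.

```sql
USE MyDatabase;

-- Find the log file's logical name and current size
SELECT name, size * 8 / 1024 AS size_mb
FROM   sys.database_files
WHERE  type_desc = 'LOG';

-- Shrink back to the previously stable size (target is in MB)
DBCC SHRINKFILE (MyDatabase_log, 512);
```

Note that `DBCC SHRINKFILE` can only release VLFs from the end of the file that are “Ready for use”, so you may need a log backup (or two) first before the shrink does anything.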
The other possibility is that while you are taking transaction log backups, you aren’t taking them frequently enough. If you have a lot of small transactions (and only small transactions), then by increasing the frequency of your transaction log backups (assuming FULL or BULK-LOGGED recovery) you will clear some space out of your log for reuse. For example, say a high-transaction database has a 3GB log file that is fully used at some point during the day. This means that at some point in the day the vast majority of the VLFs are marked as something other than “Ready for use”. The transaction log backups are taken every 30 minutes. If you change the backups to run every 10 minutes, then (again, assuming lots of small transactions) your log file will be cleared out quickly enough that you could potentially shrink your file down to close to 1GB and still have enough room. This is only going to work under very specific circumstances, however, so don’t get your hopes up. Remember that if you have even one “large” transaction running on a regular basis, say daily, that requires most or all of that 3GB, you aren’t going to be able to keep the transaction log smaller than that.
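Before changing your backup schedule, it’s worth measuring how full the log actually gets over the day. A quick sketch using documented commands:

```sql
-- Percent of each log file currently in use, for every database
DBCC SQLPERF (LOGSPACE);

-- Or, per-database (SQL Server 2012+), for the current database only
SELECT total_log_size_in_bytes / 1048576 AS log_size_mb,
       used_log_space_in_bytes / 1048576 AS used_mb,
       used_log_space_in_percent
FROM   sys.dm_db_log_space_usage;
```

Sampling this around your busiest periods will tell you whether more frequent log backups would actually keep usage low, or whether one big transaction pins the high-water mark regardless.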
If neither of these possibilities fits your situation, then you are stuck. The log file is the size it is for a reason: the workload requires it. So what will happen if you shrink the log file? It’s going to grow right back to its previous size. It may even grow larger, depending on the autogrowth settings for your log file. So how about shrinking it and capping the size? Well, your log file isn’t going to grow as big, but the transactions that required that extra space, be it one large transaction or lots of little ones, are going to start to fail due to lack of log space. So what is the solution for the “large” log file? Get used to it. Decide that it is the correct size, and if you need more disk space, then add some. It’s really the best option.