Introduction to the Transaction Log

  • Comments posted to this topic are about the item Introduction to the Transaction Log

    ~ Without obstacles, you cannot progress ~
    http://sqln.blogspot.com/

  • So it is not only me getting fed up explaining, after all. Not understanding how the transaction log works seems to be a genetic defect of homo sapiens.

    Here is a script to find out who is filling up the transaction log the most

    create procedure sp_find_who_uses_log

    as

    select top 5 tdt.database_transaction_log_bytes_used,

    tdt.database_transaction_begin_time,

    tdt.database_transaction_state,

    qt.text

    from sys.dm_tran_session_transactions tst,

    sys.dm_tran_database_transactions tdt,

    sys.dm_exec_requests er

    CROSS APPLY sys.dm_exec_sql_text(er.sql_handle) AS qt

    where tst.transaction_id = tdt.transaction_id

    and tst.session_id = er.session_id

    --and qt.dbid = 6 -- Filter by database

    order by tdt.database_transaction_log_bytes_used desc

  • Nice post. 🙂 One small detail caught my eye. You enumerated areas where the tran log can be useful, and one of those was for "quickly importing data from old database versions to a new database version". Offhandedly, I can think of replication as a specific example of this, but might you be able to expound on some other way (if any) this can be applied? Possibly from a data migration perspective, with emphasis on using two similar, but slightly different DB versions (schemas)?

    Thanks!

  • In other databases engines i have been able to translate the log to text in order to aid with fault finding. Is this possible with MS SQL Server. i.e take a copy of the log and run it up in an editor to view what has been going on.

    Cheers

    Steve

  • steve.lambert (7/2/2008)


    In other databases engines i have been able to translate the log to text in order to aid with fault finding. Is this possible with MS SQL Server. i.e take a copy of the log and run it up in an editor to view what has been going on.

    Cheers

    Steve

    Not quite. SQL Server stores everything, well mostly, in pages. You can't read the data in your tables directly with a text editor and the same is true with the logs. Red Gate and Apex make some nice tools for dealing with log events.

    ATBCharles Kincaid

  • Great article and thanks for taking the time to write it. I'm not necessarily "seasoned" but after six years with SQL Server, this still sheds a bit of light on that elusive file.

  • Good article James. I gave you 4 on it. Please consider expanding it a bit.

    Did you know that SQL Server actually writes everything to the log first and then to the .mdf? I actually had one of my servers choke, due to another process, right after I entered a cascading DELETE statement. I checked the recovery logs on restart. SQL noted that the DELETE was validated and recorded but not performed. The recovery rolled my transaction forward including enforcing the DRI down to my subordinate tables. 😎

    You could add some about the different recovery models. Some about the [[[JOY]]] of trying to recover from missing log files. Not missing log backups but missing files. The poor DBA confronted one morning with the fact that they took out the development server. They had copied all the .mdf files after detaching. "Here's the new server. Just reattach these databases." "Where are the log files?" "Did you need those? We have already secured the old drive." Secured means taking it to the machine shop to be "introduced" to a drill press and sledge. I don't want to talk about it any more. It was ugly.

    ATBCharles Kincaid

  • Nice article,

    Just a small history.

    Transaction logs being nescessary for transaction rollback and rollforward as defined by Codd & Date in the early 1970's have been part of RDBMS technology all along.

    MS jumpstarted its position in the RDBMS marketspace by co-licensing and then co-opting the Sybase v4.2 DB engine back in the mid-1980's. Its roots are still evident today in the presence of the 'master' db. MS did this because at the time, Sybase was the technology leader in part because their implementation of db cache was 100% separate (raw disk partititions) from O/S cache. By putting the DB cache under DB control, data loss is reduced and performance maximized based on the DB and not the O/S requirements which are often at odds.

    Yes, I have no life..

  • Charles,

    Thank you for the kind comments. This article originally started based on there being so many people asking why the transaction log suddenly ballooned in size over night, then the common "how do I shrink the file..." and the "SQL Server isn't truncating my transaction log. I can tell because it isn't getting smaller in size..."....:w00t: I figured that people (especially the newer DBAs) would appreciate having some sort of document that can help them to understand what is going on.

    I was waiting for this article to be released and see what feedback I would get first; however, I currently do plan on expanding on this subject. In particular to discuss how the transaction log can (or won't) be a factor in database recovery models. I also have a few other "...and did you know..." that I'm going to work into there. I think this area of topic tends to be under publicized and the documentation can get to be a bit too technical at times.

    Thanks,

    James Rea

    ~ Without obstacles, you cannot progress ~
    http://sqln.blogspot.com/

  • Bill,

    Thank you for the great detailed history there! 🙂

    Thanks,

    James Rea

    ~ Without obstacles, you cannot progress ~
    http://sqln.blogspot.com/

  • Hi,

    This is a quick update to a major typo in the article.

    In the section titled "Shrinking the Log"; I typed in that the command FILESHRINK is used to shrink the file. In actuallity the proper command is SHRINKFILE. You can view the proper command, syntax, and description on MSDN at: http://msdn.microsoft.com/en-us/library/ms189493.aspx. You can also find the link at the end of the article in the section titled "Additional Resources".

    I apologize for this typo and any others you may find.

    Thank you,

    James Rea

    ~ Without obstacles, you cannot progress ~
    http://sqln.blogspot.com/

  • I see someone already mentioned that the transaction log is a write-ahead log, which, IMO, is its most important distinguishing feature. The transaction log is a core operational strategy employed by SQL Server. It is not a record of "what happened", not an added feature, and not in the slightest optional.

    I think that most people who complain about ballooing log files are confused by the term "log." Let me try some analogies here.

    A paper check register is a type of log. You write a check, then record it in the register (don't you hate the people who hold up the line doing this?). But, you might forget to do that. You might also log the wrong amount. And the check doesn't suffer for it--it's still paid as you wrote it. Your check register is only valid if you are accurate and diligent. Not only that, but your bank couldn't care a bit if your register is wrong--it has no bearing.

    That's a write-after log. Take action, record it. And it's what most people think of when you say "log."

    Now, suppose that you don't write checks by hand, you only print them from a finance program. First, you enter into the program all the details about the check. Then, the program uses those details to print the check. The record of the check in that program is a write-ahead log. The check can't exist without that log entry, and the log entry was used to create the actual check.

    SQL Server uses a write-ahead transaction log. A committed transaction goes to the log FIRST. The lazy writer uses the LOG as its source of modifications to make to the actual database files at the next checkpoint. Before that checkpoint occurs, your committed data exists only in the log. Therefore, the log is always accurate and complete because the data wouldn't be there without it.

    And, I have to take issue with the MS Access origin of the transaction log. I was working with SQL Server 6.0 at the same time as MS Access 2.0 was hot. SQL Server has used a write-ahead log for its core operation for longer than MS Access has existed! I think any feature copying was the other way around.

  • Stephanie

    Your analogy is valid, correct, and well stated. It is possible to turn it off, however. Set recovery mode to simple and nothing gets logged. In that sense it is optional. It's not wise to do that under most circumstances.

    ATBCharles Kincaid

  • Excellent article James.

    It should be required reading as entry criteria to join SSC !

    RegardsRudy KomacsarSenior Database Administrator"Ave Caesar! - Morituri te salutamus."

  • Charles Kincaid (7/2/2008)It is possible to turn it off, however. Set recovery mode to simple and nothing gets logged. In that sense it is optional.

    Um... no, sorry, it doesn't work that way.

    You cannot write anything to the database without it being logged FIRST. Maybe there are some exceptions in bulk operations, but that's more a QOTD topic.

    I mentioned checkpoints in my previous post, and that's the key to simple recovery mode. When you run a statement that modifies the database, the actions you take are written to the log immediately upon commit (or even before? I think sometimes before commit). The log file writes are immediately flushed to disk. (A good reason to give a large database a separate drive for its log files)

    When a checkpoint occurs (various methods of determining when... search BOL for more info), the lazy writer (called lazy because it doesn't write until a checkpoint) reads the log entries since the last checkpoint and writes the changes to the database files (mdf file). Then the checkpoint is written to the log.

    That's when simple recovery comes into play. If simple recovery is turned on, the log is truncated after each checkpoint. If full recovery is set, the log isn't truncated until it is backed up with truncate option. Full recovery with an appropriate backup/restore strategy enables point-in-time recovery.

    Without a transaction log, SQL Server would be no better than Microsoft Word. When you're working in a Word document, what happens when your desktop machine is suddenly unplugged? Hopefully, Auto Save has some older version of your document for you. What did Auto Save get, though? Is it consistent? Were you in the middle of typing a sentence? Who cares, it's a Word document, it doesn't have data integrity rules! Now, if you unplug your SQL Server while it's running (should not be possible on a production machine, but just supposing), what happens? The transaction log is already written to disk. When SQL Server restarts, it recovers itself by reading the log. It rolls forward and back until it ends at a point where every transaction committed before it lost power is there, and consistent, in the database. This works even for Simple Recovery mode. It has to work, otherwise pulling the plug could corrupt your database.

    The author of the article that started this disucssion has a good point. DBAs must understand how and why the transaction log works. It's fundamental to knowing what is possible with SQL Server. Knowing how recovery works gives you ammunition to convince a customer that an Excel workbook is not a database! (I haven't had to do that since the '90s... showing my age here...)

Viewing 15 posts - 1 through 15 (of 64 total)

You must be logged in to reply to this topic. Login to reply