Introduction to the Transaction Log

  • Now to add on to what Charles has already said...

    The flip side of the performance coin between SIMPLE and FULL is that if you have a large number of transactions committing per second, it's plausible that the SIMPLE recovery model could actually hinder performance more than the FULL recovery model would.

    Say you have around 1,000 transactions per second being committed. While the FULL recovery model will log more details to the log, the SIMPLE recovery model will log minimal details... but where the performance is lost is when these transactions are committed. The FULL recovery model will continue to add transactions to the list in the log, while the SIMPLE recovery model will continuously clear (TRUNCATE) the log upon each checkpoint.

    If you think about running a TRUNCATE 1,000 times a second compared to allowing the log to simply continue to fill (assuming the log is appropriately sized for this), you will notice the SIMPLE recovery model is continuously working to release log space at checkpoints, while the FULL recovery model just waits until you trigger the truncation yourself (typically when a log backup is executed, although other methods can also be used).

    If a transaction is immediately committed in the above scenario, then SIMPLE is much slower than FULL; however, if you change the method of committing to, say, one commit per 10,000 transactions (should the scenario allow it, such as an ETL situation), then that could definitely change which model is slower.
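
    To illustrate the batching idea, here is a rough sketch (all table and column names are invented; treat it as a pattern, not tested code) that commits once per 10,000 rows instead of once per row:

    DECLARE @Rows int;
    SET @Rows = 1;

    WHILE @Rows > 0
    BEGIN
        BEGIN TRANSACTION;

        -- Move up to 10,000 staged rows per transaction: one log flush per
        -- COMMIT instead of one per row.
        DELETE TOP (10000)
        FROM dbo.Staging
        OUTPUT DELETED.Col1, DELETED.Col2 INTO dbo.Target (Col1, Col2);

        SET @Rows = @@ROWCOUNT;

        COMMIT TRANSACTION;
    END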

    Also, take other factors into account, such as where the log is stored, how it is sized, the total number of transactions in a given time frame, the average size of those transactions, and the method by which transactions are being committed (BULK INSERT is another example where the proper recovery model can impact log performance, but that is getting into a very specific scenario).

    As Charles points out, take into account the trade-off of performance vs. safety... and hopefully I am pointing out that the situation itself can also change which model is faster, depending on a varying array of factors.

    Hope this helps!

    James

    ~ Without obstacles, you cannot progress ~
    http://sqln.blogspot.com/

  • Interesting.

    I have two databases that are replicated and that generate tens of millions of commands a few times a month. Most of the db is deleted and new data inserted continuously. One seems to replicate fine, and the other's replication always gets backed up during the big batch processes. I just checked, and the one that's OK uses the Full recovery model while the problem DB is Simple. We set them up like this 18 months ago.

  • Charles Kincaid (11/9/2009)


    Full Recovery mode writes more stuff than Simple Recovery mode.

    James_DBA (11/9/2009)


    While the FULL recovery model will log more details to the log, the SIMPLE recovery model will log minimal details

    I am sorry but I think you both got it wrong. What makes you think the amount of data logged is different in Simple and Full recovery modes? The only difference between those two modes is how long SQL Server holds on to the data, not the data that is logged. In Simple recovery mode the log is self-truncated after a checkpoint. In Full recovery mode a DBA must maintain the log and use the BACKUP command to manage it.

    The only mode that logs different data to the log is the Bulk Logged.
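
    To make the operational difference concrete, here is a sketch (MyDatabase and the backup path are made-up names): in Full recovery the inactive log only clears when you back it up, while in Simple it clears itself at each checkpoint.

    -- Full recovery: the DBA manages the log with BACKUP LOG.
    ALTER DATABASE MyDatabase SET RECOVERY FULL;
    BACKUP LOG MyDatabase TO DISK = 'C:\Backups\MyDatabase_log.trn';

    -- Simple recovery: checkpoints truncate the inactive log automatically.
    ALTER DATABASE MyDatabase SET RECOVERY SIMPLE;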

    ---------------------------------------------
    [font="Verdana"]Nothing is impossible.
    It is just a matter of time and money.[/font]

  • Good observation. Yep, I can be wrong. So then does it come down to "Is Simple worth the risk?"

    ATB
    Charles Kincaid

  • I would say yes. It is a judgement call between convenience and data loss risk.

    I have to admit that my previous statement about the data logged is not exactly correct, because in Full recovery mode there are some items that are (or could be) logged that Simple recovery mode does not need (or support). But I think, as far as performance is concerned, the difference is negligible.

    For example, you can use marked transactions to mark a restore point in case of failure. One of the biggest advantages of Full recovery mode is the ability to perform 'point-in-time' restores.

    On the other hand, I think the biggest problem with a point-in-time restore is finding that 'point in time'. If the objective of the restore operation is to recover data after a media failure, then in most cases you would want to go all the way and restore to the latest possible point.

    If you have to restore the data due to operator error (someone updated the wrong data), then it is much harder to find that 'point in time'. Marked transactions can help find it.
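
    For illustration, a sketch of the syntax (the names and path here are invented, and this is not a tested procedure):

    -- Mark the transaction; the mark name is recorded in the log.
    BEGIN TRANSACTION MonthEndLoad WITH MARK 'Month-end load';
        UPDATE dbo.Accounts SET LastLoadDate = GETDATE();
    COMMIT TRANSACTION;

    -- Later, restore the log chain up to just before the marked transaction:
    RESTORE LOG MyDatabase
    FROM DISK = 'C:\Backups\MyDatabase_log.trn'
    WITH STOPBEFOREMARK = 'MonthEndLoad', RECOVERY;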

    ---------------------------------------------
    [font="Verdana"]Nothing is impossible.
    It is just a matter of time and money.[/font]

  • magarity (11/7/2009)


    Since all I have is a dev server I wish the logs would go away automatically. I understand the importance of production systems' logs but I have no need for them - if there's a problem of any kind, I just reload my test files in barely a minute. From my POV logs hanging around after commit just clog up space, dang things.

    If you use the Simple recovery mode then you do not have to worry about the logs on the development server; SQL Server will deal with them automatically. Unfortunately (or fortunately, depending on your point of view) you must have a transaction log for each database on SQL Server. You may not have a need for them, but they are critical to SQL Server's reliability.
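
    A quick way to confirm the setting, if it helps (SQL 2005 and later; this only reads metadata):

    -- Show each database's recovery model at a glance.
    SELECT name, recovery_model_desc
    FROM sys.databases;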

    ---------------------------------------------
    [font="Verdana"]Nothing is impossible.
    It is just a matter of time and money.[/font]

  • JacekO (11/9/2009)


    One of the biggest advantages of Full recovery mode is the ability to perform 'point-in-time' restores.

    My mileage varies here. For me, unplanned server restarts are the big one. Having a customer call to say that the cleaning person unplugged the SQL Server box to use the vacuum, again, and there was no data loss is kind of a big thing. The point-in-time thing can come close if you restore your last full and then the differentials up to the point in time. How frequent are your backups? Is a full every day and logs every 15 minutes enough? Is one per minute way too much?

    Here is a question for someone who has the time: can I run a query that shows whether there are any un-backed-up changes in my database?

    ATB
    Charles Kincaid

  • Charles Kincaid (11/9/2009)

    My mileage varies here. For me, unplanned server restarts are the big one. Having a customer call to say that the cleaning person unplugged the SQL Server box to use the vacuum, again, and there was no data loss is kind of a big thing. The point-in-time thing can come close if you restore your last full and then the differentials up to the point in time. How frequent are your backups? Is a full every day and logs every 15 minutes enough? Is one per minute way too much?

    Here is a question for someone who has the time: can I run a query that shows whether there are any un-backed-up changes in my database?

    OK, now don't make fun of this query, please... I made this all up as I went along, trying to see if I could come up with something along the lines of what you are wondering is even possible, Charles... I'm sure a TSQL whiz on this site can really clean this up and make it much more efficient... with that said, try this out...

    -- First populate #LogInfo from the log. This step is implied by the
    -- description below; fn_dblog is the undocumented log-reader function.
    SELECT * INTO #LogInfo FROM fn_dblog(NULL, NULL);

    SELECT * FROM #LogInfo
    WHERE [Current LSN] > (SELECT [Current LSN]
                           FROM #LogInfo
                           WHERE [Checkpoint Begin] = (SELECT TOP (1) [Checkpoint Begin]
                                                       FROM #LogInfo
                                                       WHERE [Checkpoint Begin] IS NOT NULL
                                                       ORDER BY [Checkpoint Begin] DESC))

    Basically, what this does is copy the log file contents into a temporary table (#LogInfo), find the LSN of the latest "Checkpoint Begin" entry, and then return all the LSNs in the table that come after that LSN.

    The concept here is that:

    A) if you get back 10 or fewer rows, then you pretty much have a completely backed-up database/log (the number of rows will change depending on the type of backup; these few "default" rows appear to be header-type rows for the next portion of the log), and

    B) if you've done backups of the database (FULL or DIFF), then the log creates a new "Checkpoint Begin" entry... so you are only concerned with the data logged after that LSN.

    Now, I did all of this on a database using FULL backups... so it may not behave the same under SIMPLE. Also, I did not test this out very thoroughly, so I'm sure there are some bugs in the logic and/or TSQL. But it should get the ball rolling... toward a script that gives the details you are wondering about.
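
    As a rougher but much cheaper cross-check (this relies on msdb backup history not having been purged, so treat it as an approximation): anything committed after the last log backup finished is, by definition, not yet in a log backup.

    -- When did the last log backup of the current database finish?
    SELECT TOP (1) backup_finish_date
    FROM msdb.dbo.backupset
    WHERE database_name = DB_NAME()
      AND type = 'L'   -- 'L' = transaction log backup
    ORDER BY backup_finish_date DESC;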

    Hope this helps!

    James

    P.S. I created this on SQL 2008; it should work just fine on SQL 2005, as I don't see anything that isn't 2005-compatible... SQL 2000 users will probably need to find another method (I'm not sure, but I think there was a table called "syslog" that contained this data in SQL 2000... can anyone confirm?)

    [EDIT] Also, this script needs to be run from within the database whose log contents are being inspected. In case it needs to be said, and to CMA here: this script is untested and should only be run in development environments. The speed of the script depends on the size of the log file; in other words, if your log file is gigabytes in size, it will take a long time to run! [/EDIT]

    ~ Without obstacles, you cannot progress ~
    http://sqln.blogspot.com/

  • Here is a simplified version of my last query...

    SELECT * FROM #LogInfo
    WHERE [Current LSN] > (SELECT TOP (1) [Current LSN]
                           FROM #LogInfo
                           WHERE [Checkpoint Begin] IS NOT NULL
                           ORDER BY [Checkpoint Begin] DESC)

    I'm not at my desktop, so I haven't tested this yet to make sure it didn't get broken in the revision; use at your own risk.

    Hope this helps,

    James

    ~ Without obstacles, you cannot progress ~
    http://sqln.blogspot.com/

  • JacekO (11/9/2009)


    The only difference between those two modes is how long SQL Server holds on to the data, not the data that is logged. In Simple recovery mode the log is self-truncated after a checkpoint. In Full recovery mode a DBA must maintain the log and use the BACKUP command to manage it.

    The only mode that logs different data to the log is the Bulk Logged.

    In simple recovery, bulk operations are also minimally logged. It's not clearly stated in BoL, but see the intro paragraph here: http://msdn.microsoft.com/en-us/library/ms191244.aspx; it's also implied here: http://msdn.microsoft.com/en-us/library/ms191484.aspx
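
    For a concrete (hypothetical) example of such an operation: a bulk load into a heap taken with a table lock is one of the classic candidates for minimal logging, in Simple as well as Bulk-Logged. The table and file names below are invented.

    -- A bulk load that can qualify for minimal logging (heap target, TABLOCK).
    BULK INSERT dbo.Target
    FROM 'C:\Loads\data.txt'
    WITH (TABLOCK);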

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • James_DBA (11/9/2009)


    While the FULL recovery model will log more details to the log, the SIMPLE recovery model will log minimal details....

    Only if those inserts could be minimally logged. Otherwise the logging will be the same.

    but where the performance is lost is when these transactions are committed. The FULL recovery model will continue to add transactions to the list in the log, while the SIMPLE recovery model will continuously clear (TRUNCATE) the log upon each checkpoint.

    Why do you say there'll be a performance loss on commit if the log is getting cleared (of inactive portions only) on a checkpoint?

    Got some tests that show the performance degradation?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Thanks James. I'm having a great deal of trouble seeing your script. Must be this silly IE 8.

    ATB
    Charles Kincaid

  • Why do you say there'll be a performance loss on commit if the log is getting cleared (of inactive portions only) on a checkpoint?

    Got some tests that show the performance degradation?

    Looking at it rationally rather than empirically, I'd say that since a log file is written to sequentially, any activity (such as truncation) that moves the disk heads away from the end of the file is likely to have some impact on performance. I haven't done any tests to ascertain how noticeable such an impact is.

    John

  • Charles Kincaid (11/10/2009)


    Thanks James. I'm having a great deal of trouble seeing your script. Must be this silly IE 8.

    That sure did a horrible formatting job...

    Here is the script again... without the formatting by SSC.

    --START SCRIPT
    SELECT * FROM #LogInfo
    WHERE [Current LSN] >
    (
        SELECT TOP (1) [Current LSN]
        FROM #LogInfo
        WHERE [Checkpoint Begin] IS NOT NULL
        ORDER BY [Checkpoint Begin] DESC
    )
    --END SCRIPT

    Thanks,

    James

    ~ Without obstacles, you cannot progress ~
    http://sqln.blogspot.com/

  • John Mitchell-245523 (11/10/2009)


    Looking at it rationally rather than empirically, I'd say that since a log file is written to sequentially, any activity (such as truncation) that moves the disk heads away from the end of the file is likely to have some impact on performance.

    True, but unless the log is the sole file on the physical disk (no other tran logs, no data, no backups, etc.) the head will be moving to write the other files anyway. Also, a log backup will move the head, as will a transaction rollback (as far as I know), since both have to read the log.

    I'd just like to hear the reason behind the claim and any proof.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
