Minimal Logging

  • As you know, the essential difference with minimal logging is that SQL Server does not log individual rows - it logs at the page and extent level.

    For log entries generated by inserts to a clustered index, minimal logging means you will not see as many transaction log entries with Operation = N'LOP_INSERT_ROWS' and Context = N'LCX_CLUSTERED'. In general, you will not see any of these for minimally-logged inserts to a non-empty b-tree where the inserted rows go to newly allocated pages. This requires TF 610 or SQL Server 2016+ where that behaviour is default.

    Minimal logging does not necessarily mean that less physical log space is used, because log reservation is conservative. As the name log reservation suggests, there must be enough physical log space to ensure rollback would succeed (writing compensation records and so on). Some operations during minimally logged inserts to a populated b-tree have a very high log reservation (technically unnecessary, but safe). This means that if you start with a very small log (not best practice), the log may have to grow to honour the log reservation requirement. Minimal logging will still, generally speaking, write less log (in terms of log record size, and the quantity of log records generated).

    For those reasons, measuring the size of the log file is not the right metric. You could instead compare the number and SUM ([Log Record Length]) of log records generated using sys.fn_dblog. Note that this excludes the log reservation size. Minimal logging is not guaranteed to be 'better' for performance than full logging in all scenarios - there are trade-offs. Having the log file pre-sized appropriately to the maximum expected usage is good practice, and will help show any performance differences by avoiding the log growths that tend to dominate.

    Finally: Minimal logging into a populated b-tree is not exactly the same as minimal logging into an empty b-tree. Inserts into an empty b-tree are still much more efficient in many ways. The point of minimal logging into a non-empty b-tree is that it logs fewer records of smaller size (excepting conservation reservation) than full logging would, and generates fewer log flushes. You can see the difference by disabling TF 610 (or enabling TF 692 on SQL Server 2016+). The fair comparison to make is between inserts to a populated b-tree with minimal logging enabled versus inserts to a populated b-tree with it not enabled. In particular, minimal logging into a populated b-tree generates more log records with a high reservation requirement.

    Using Jeff's test rig with a log pre-sized to avoid growths (and without the log shrinks) I measured the log record count and record length sum using:

    SELECT
        Num = COUNT_BIG(*),
        Size = SUM(FD.[Log Record Length])
    FROM sys.fn_dblog(NULL, NULL) AS FD;

    For the second insert I found 757,645 records of total size 60,078,616 generated using minimal logging, and 1,460,123 records of total size 623,933,640 when minimal logging was disabled.

  • @ Paul White: Thank you! Everything makes perfect sense to me now. I confirmed the behavior you describe with my own tests, including 55% time savings from minimal logging into a table with a clustered index and data compared to full logging when the logfile is properly sized.

    Obviously, I want to have my cake and eat it. So I want fast inserts from minimal logging with a clustered index and data in the table, but also a small logifile.

    I can achieve that with fairly simple batching. In my test I broke the big insert into 10 batches. That only cost 17% more runtime, but it saved 80-90% of needed logfile size. That is a very workable tradeoff for me.

    Than you for invaluable help to Paul as well as Jeff.

    Thomas

  • As always, Paul, your technical knowledge is impeccable.  Thank you very much.

    The unfortunate part is that which Thomas realized in his testing isn't clearly explained in the MS documentation.  In fact, the MS documentation states that doing things in smaller batches is not the way to go.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 3 posts - 31 through 32 (of 32 total)

You must be logged in to reply to this topic. Login to reply