SSIS and Transaction Log Growth

  • We have this scenario and fixed it by creating a Staging database in SIMPLE recovery alongside the DW. Fast loads into staging then get minimal logging. Staging is backed up weekly, since we really only need the schema to rebuild it should something go terribly wrong. Jobs can truncate tables in the Staging DB, but not in the DW. A rough sketch of the pattern is below.
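
    The sketch assumes hypothetical database and table names (StageDW, dbo.stg_FactSales):

    -- Keep the staging database in SIMPLE recovery so the fast loads stay minimally logged
    ALTER DATABASE StageDW SET RECOVERY SIMPLE;
    GO

    -- Each job clears its own staging table before the SSIS fast load runs
    USE StageDW;
    TRUNCATE TABLE dbo.stg_FactSales;
    GO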

  • Nice QOTD Craig, one of the best we've had for a while I think.

  • IanSinclair (3/5/2014)


    And if the Maximum Insert Commit Size in the OLEDB destination is already set to a reasonable value you.....

    This option might not work - so splitting a source stream into a few hundred streams might be unwieldy but necessary.

    You'll have to explain to me why that won't work; I'm afraid I'm not familiar with the problem you've only hinted at without explanation.

    Stewart "Arturius" Campbell (3/5/2014)


    Alternatively, change the recovery model to bulk logged just before the load and back on completion thereof.

    however, this will require a full backup before and after.

    Bulk logged reduces log space usage by using minimal logging for most bulk operations.

    see Prerequisites for Minimal Logging in Bulk Import for more information in this regard

    good question, thanks Craig

    Bulk-logged and Simple use similar logging techniques for minimally logged transactions, so either would work for that purpose. In my experience, while this will help, it opens a significant window of point-in-time recovery loss (hours, remember) that would be unacceptable for most systems. Edit: I should have also mentioned that in my explanation below, but that was starting to get REALLY long. You're right, though, it would HELP. I'm not sure it would fix it, however. Depends on where the bulk-logged operations sat in the log when backups tried to run. /EDIT
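
    For anyone who wants to try that, the sequence would look roughly like this - a sketch only, with a hypothetical database name and backup paths, following Stewart's full-backup-before-and-after advice:

    -- Full backup before switching, per Stewart's suggestion
    BACKUP DATABASE DW TO DISK = N'X:\Backups\DW_preload.bak';

    ALTER DATABASE DW SET RECOVERY BULK_LOGGED;

    -- run the SSIS load here

    ALTER DATABASE DW SET RECOVERY FULL;

    -- Full backup after, to close the bulk-logged window
    BACKUP DATABASE DW TO DISK = N'X:\Backups\DW_postload.bak';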

    Koen Verbeeck (3/5/2014)


    Nevyn (3/5/2014)


    Koen Verbeeck (3/5/2014)


    Interesting question, but I do not agree entirely with the answer.

    Setting the commit size to a lower value will commit batches before the entire load is done. What if something fails and I want to roll everything back?

    Splitting the load up into smaller chunks, each in a manageable transaction, seems to make more sense.

    Well, the destination in this case IS a staging table. If you want to roll everything back, you can just truncate it again, no?

    Point taken 😀

    Heh, yup, that was part of the reason I made sure to mention that this went to a staging structure, which was the concern. Koen just needs to have more morning coffee. 🙂

    Thanks for the comments guys, keep 'em coming. I'm human, I could have made a mistake in this somewhere. :hehe:


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions | Forum Netiquette
    For index/tuning help, follow these directions. | Tally Tables

    Twitter: @AnyWayDBA

  • Evil Kraig F (3/5/2014)


    Koen just needs to have more morning coffee. 🙂

    ... as always :rolleyes:

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Stewart "Arturius" Campbell (3/5/2014)


    Alternatively, change the recovery model to bulk logged just before the load and back on completion thereof.

    however, this will require a full backup before and after.

    Bulk logged reduces log space usage by using minimal logging for most bulk operations.

    see Prerequisites for Minimal Logging in Bulk Import for more information in this regard

    good question, thanks Craig

    I'm no SSIS expert, so can we assume this insert will be in the form of a bulk import and not just a T-SQL insert statement?

    ---------------------------------------------------------------------

  • and 3 points worth - nice. 🙂

    ---------------------------------------------------------------------

  • george sibbald (3/5/2014)


    Stewart "Arturius" Campbell (3/5/2014)


    Alternatively, change the recovery model to bulk logged just before the load and back on completion thereof.

    however, this will require a full backup before and after.

    Bulk logged reduces log space usage by using minimal logging for most bulk operations.

    see Prerequisites for Minimal Logging in Bulk Import for more information in this regard

    good question, thanks Craig

    I'm no SSIS expert, so can we assume this insert will be in the form of a bulk import and not just a T-SQL insert statement?

    Yes, if you meet all the requirements in the linked article. Otherwise, it is a logged insert just like a T-SQL insert.
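
    As a rough illustration, a bulk import that meets the usual prerequisites (heap or empty target, a table lock, non-FULL recovery) would be minimally logged - the table and file names here are made up:

    -- Target is a heap in a SIMPLE or BULK_LOGGED database, loaded under a table lock
    BULK INSERT dbo.stg_FactSales
    FROM 'X:\Feeds\fact_sales.csv'
    WITH (TABLOCK, FIELDTERMINATOR = ',', ROWTERMINATOR = '\n');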

  • Yes, but the question says insert rather than import, so does SSIS run the insert as a bulk import by default, or is that a setting you can choose? Just trying to expand my depth of knowledge here 🙂

    ---------------------------------------------------------------------

  • george sibbald (3/5/2014)


    Yes, but the question says insert rather than import, so does SSIS run the insert as a bulk import by default, or is that a setting you can choose? Just trying to expand my depth of knowledge here 🙂

    It's an option, called Fast Load in the interface.
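
    A rough T-SQL analogue of what fast load with a Maximum Insert Commit Size does (names made up): each batch commits on its own, so the active part of the log stays small instead of holding one giant transaction.

    BULK INSERT dbo.stg_FactSales
    FROM 'X:\Feeds\fact_sales.csv'
    WITH (TABLOCK, BATCHSIZE = 500000);  -- commit every 500,000 rows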


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions | Forum Netiquette
    For index/tuning help, follow these directions. | Tally Tables

    Twitter: @AnyWayDBA

  • Nice question...

    Thanks, Craig.

  • Stewart "Arturius" Campbell (3/5/2014)


    Alternatively, change the recovery model to bulk logged just before the load and back on completion thereof.

    however, this will require a full backup before and after.

    Bulk logged reduces log space usage by using minimal logging for most bulk operations.

    see Prerequisites for Minimal Logging in Bulk Import for more information in this regard

    good question, thanks Craig

    Bulk-logged and Simple use similar logging techniques for minimally logged transactions, so either would work for that purpose. In my experience, while this will help, it opens a significant window of point-in-time recovery loss (hours, remember) that would be unacceptable for most systems. Edit: I should have also mentioned that in my explanation below, but that was starting to get REALLY long. You're right, though, it would HELP. I'm not sure it would fix it, however. Depends on where the bulk-logged operations sat in the log when backups tried to run. /EDIT

    Bulk-logged and simple recovery are not the same. If you put the DB in simple mode and any transaction happens, the LSN chain breaks and you need a full DB backup immediately, else your t-log backups fail. However, bulk-logged recovery won't break the LSN chain; the only issue is point-in-time recovery.
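
    In other words, the difference shows up when you try to resume log backups afterwards - a sketch with a hypothetical DW database and backup paths:

    -- SIMPLE: the log chain is broken; a full (or differential) backup is needed
    -- before log backups will succeed again
    ALTER DATABASE DW SET RECOVERY SIMPLE;
    -- ...load runs here...
    ALTER DATABASE DW SET RECOVERY FULL;
    BACKUP DATABASE DW TO DISK = N'X:\Backups\DW_restart_chain.bak';
    BACKUP LOG DW TO DISK = N'X:\Backups\DW.trn';

    -- BULK_LOGGED: the chain stays intact and a log backup alone is enough,
    -- but you lose point-in-time recovery inside the bulk-logged window
    ALTER DATABASE DW SET RECOVERY BULK_LOGGED;
    -- ...load runs here...
    ALTER DATABASE DW SET RECOVERY FULL;
    BACKUP LOG DW TO DISK = N'X:\Backups\DW.trn';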

    ---------------------------------------------------
    "Thare are only 10 types of people in the world:
    Those who understand binary, and those who don't."

  • free_mascot (3/5/2014)


    Bulk-logged and Simple use similar logging techniques for minimally logged transactions, so either would work for that purpose. In my experience, while this will help, it opens a significant window of point-in-time recovery loss (hours, remember) that would be unacceptable for most systems. Edit: I should have also mentioned that in my explanation below, but that was starting to get REALLY long. You're right, though, it would HELP. I'm not sure it would fix it, however. Depends on where the bulk-logged operations sat in the log when backups tried to run. /EDIT

    Bulk-logged and simple recovery are not the same. If you put the DB in simple mode and any transaction happens, the LSN chain breaks and you need a full DB backup immediately, else your t-log backups fail. However, bulk-logged recovery won't break the LSN chain; the only issue is point-in-time recovery.

    You're acting like I said that the log chain wasn't going to be broken in simple mode. I said that the exposure time for point in time recovery was too high for most locations. That they use similar bulk logging techniques is a different piece of what I said.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions | Forum Netiquette
    For index/tuning help, follow these directions. | Tally Tables

    Twitter: @AnyWayDBA

  • Finally, I can say.....easy peasy!

  • Koen Verbeeck (3/5/2014)


    Interesting question, but I do not agree entirely with the answer.

    Setting the commit size to a lower value will commit batches before the entire load is done. What if something fails and I want to roll everything back?

    Splitting the load up into smaller chunks, each in a manageable transaction, seems to make more sense.

    That was my reaction too on discovering that splitting the stream into small chunks was not the "correct" answer. When I split the thing, I know where it got to before any failure; when I use a magic number under the covers I have some working out to do to find out how to recover. I believe the latter is more complicated than the former, so the "correct" answer is not by any means the simplest way to achieve the objective, except in that extremely rare case (for most of us it never happens) where coping with failure is not required for some reason.

    Tom

  • Nevyn (3/5/2014)


    Koen Verbeeck (3/5/2014)


    Interesting question, but I do not agree entirely with the answer.

    Setting the commit size to a lower value will commit batches before the entire load is done. What if something fails and I want to roll everything back?

    Splitting the load up into smaller chunks, each in a manageable transaction, seems to make more sense.

    Well, the destination in this case IS a staging table. If you want to roll everything back, you can just truncate it again, no?

    Yes, but that's potentially a big performance hit if you have to truncate the table and then insert several hundred million rows that had already been inserted correctly, instead of just the chunk that failed and all following chunks.
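
    If each chunk carries a batch key, a failed run only has to clean up and redo the chunk that broke - a rough sketch, with made-up table and column names:

    -- Delete only the partially loaded chunk, then restart the loop/package at that chunk
    DECLARE @FailedChunk int = 417;  -- the chunk that was running when the load failed

    DELETE FROM dbo.stg_FactSales
    WHERE LoadChunkID = @FailedChunk;

    -- Resume loading from @FailedChunk onward instead of truncating and
    -- reloading the chunks that already committed cleanly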

    Tom
