SSIS Rows per Batch and Max Insert Commit Size

  • I have an SSIS package which is copying 5 large tables daily. There is no WHERE logic in these data pulls, every record is selected and the source tables will only keep growing.

    Currently, we are now at the point where the SSIS package causes the Transaction Log on the target SQL Server to run out of space.

    I noted that the OLEDB Destination connections for the package currently have the default Rows per Batch, and Max Insert Commit Size set (Data Access is Fast Load).  My understanding is that this means that SSIS will attempt to load all rows in a single transaction which has now exceeded the max available log space.

    I've read that if I change the Max Insert Commit Size to a smaller number, SQL Server will commit each transaction after that many rows which "...may help with transaction log and temp db size"

    My questions:  If I change this, will the log still keep growing unless I take a log backup during the data load so that the already committed transactions of smaller batches are cleared from the log?

    Also, do I need to match the rows per batch to the max insert commit size?

    Basically I need to load this data but reduce the trans log usage so it doesn't grow past it's imposed hard cap.  Any advice appreciated.

    I can't put the target DB into Simple Recovery as the target DB is participating in a HA group.

    • This topic was modified 19 hours, 24 minutes ago by  planetmatt.
  • Thanks for posting your issue and hopefully someone will answer soon.

    This is an automated bump to increase visibility of your question.

  • Fast load with table lock will attempt to insert with minimal logging, but there are requirements of the destination table. I believe that even in a full recovery model the insert can be minimally logged if:

    1. The table is empty and has only a clustered index or is a heap.
    2. The table is not empty and has no clustered index.

    If there are non-clustered indexes then the insert will be fully logged. I don't know about the transaction log, but it seems likely it will keep growing whatever the batch size. If you can make the insert minimally logged it might not matter.  This link explains the requirements and shows the differences between full and bulk logged/simple recovery models. https://www.mssqltips.com/sqlservertip/1185/minimally-logging-bulk-load-inserts-into-sql-server/

  • So, you're doing a wholesale replacement of all the rows in the target tables every night?

    If that's so, why not create a second database that's in the SIMPLE recovery model so that you're not constantly backing up a wad of rows that haven't changed and won't change in the future.  Obviously, it'll also help you big time with things like minimal logging, etc.  It'll also help you with setting up a "Swap'n'Drop" system the repoints synonyms so that if the transfer fails, you're not actually out of business and can continue to use the previous day's data.

    And your HA won't take a beating every day because of all the data you've changed even though most of it will not have changed.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • To the specific question - yes, setting a commit size and batch size will reduce the impact on the transaction log and allow the log to be truncated.  Whether or not that actually works in your situation will depend on how often you run log backups and how much data is loaded between each log backup.

    I generally start with the same size for both commit and batch sizes - and adjust if needed.  That value will depend on the size of the database - size of the transaction log and amount of data being extracted and loaded.  Since I don't normally load data into a database in full recovery or part of an AG - I don't usually need to worry about log backups.  But - if this is high-volume then I would modify the log backups to run every couple of minutes during the load.

    Jeffrey Williams
    Problems are opportunities brilliantly disguised as insurmountable obstacles.

    How to post questions to get better answers faster
    Managing Transaction Logs

  • Jeffrey Williams wrote:

    To the specific question - yes, setting a commit size and batch size will reduce the impact on the transaction log and allow the log to be truncated.  Whether or not that actually works in your situation will depend on how often you run log backups and how much data is loaded between each log backup.

    I generally start with the same size for both commit and batch sizes - and adjust if needed.  That value will depend on the size of the database - size of the transaction log and amount of data being extracted and loaded.  Since I don't normally load data into a database in full recovery or part of an AG - I don't usually need to worry about log backups.  But - if this is high-volume then I would modify the log backups to run every couple of minutes during the load.

    It may also make the impact on the log file much greater because anything after the first insert will be more than what "Minimally Logging" would impart.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Jeff Moden wrote:

    It may also make the impact on the log file much greater because anything after the first insert will be more than what "Minimally Logging" would impart.

    Hi Jeff,

    Can you expand on this answer a bit please?

    Are you saying that if you set a commit size limit, only the first batch is minimally logged?

Viewing 7 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply