How to reduce log file size during BCP in?

  • I used the BCP utility to load data from a flat file into MS SQL Server 2005. The flat file is about 15 GB in size and has 443 million rows. The database I used has the SIMPLE recovery model. However, when loading the data, I found that the log file of this database grew to 210 GB!

    Is there any good approach to reducing the log file size while loading data using BCP?

    Many thanks in advance for any input.

  • What batch size are you using?

  • By default, BCP loads everything in one transaction, so your transaction was 443 million rows and your transaction log file had to grow to hold that.

    Use the BCP batch size parameter to limit the size of the transaction to a much smaller number, say 100,000 to 1,000,000 rows. You should read about the BCP utility parameters in SQL Server Books Online.

    BCP -b 100000
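
    As a minimal sketch, a complete command with a batch size might look like this (server, database, table, and file names here are hypothetical; -T assumes a trusted connection):

    rem Hypothetical example: character-format import committed every 100,000 rows,
    rem so a SIMPLE-recovery database can reuse its log space after each batch.
    bcp MyDb.dbo.BigTable in C:\data\bigfile.dat -c -b 100000 -S MyServer -T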

  • Drop all indexes, then BCP in the data (fast BCP). After BCP is done, recreate the indexes.

  • Drop all indexes, then BCP in the data (fast BCP). After BCP is done, recreate the indexes.

    This does not reduce the size of the transaction log. As was previously stated, the transactions performed by bcp are logged. To speed up the load, and to keep the t-log from growing by leaps and bounds, use the -b flag.

    Your command should look like:

    bcp <databasename>..<tablename> in <filename> <-c (if character/text) or -n (if native)> -b<number of records per batch; I usually use 1000> -S <servername> -U <user> -P <password>

    The -b flag is the key. The larger the batch, the longer each commit takes and the more t-log space each batch uses.

    I recently loaded 414 M rows in under 2 hrs with this command string (using native).
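
    If you want to watch the batch size's effect on the log while the load runs, standard DBCC syntax reports log usage per database:

    -- Reports log size and percent used for every database; under the simple
    -- recovery model the percent used should drop back after each batch commits.
    DBCC SQLPERF (LOGSPACE);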

    Steve Jimmo
    Sr DBA
    "If we ever forget that we are One Nation Under God, then we will be a Nation gone under." - Ronald Reagan

  • Using -b is not entirely correct: the load is still logged; however, the log space will be reused for each batch if you are using the simple recovery model.

    To achieve a minimally logged load, you must ensure the following (a command sketch follows after the list):

    1. You are using simple or bulk logged recovery.

    2. The target table is not being replicated.

    3. The target table has no triggers.

    4. The target table must be empty OR have no indexes (clustered & nonclustered).

    5. The TABLOCK hint is specified. (For bcp use -h TABLOCK)

    I have just loaded 300 million rows into a non-empty table without using -b in 4 minutes.
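
    As a minimal sketch of such a load (all names hypothetical; -T assumes a trusted connection):

    rem Hypothetical minimally logged load: native format, table lock, no -b,
    rem with the target meeting the five conditions listed above.
    bcp MyDb.dbo.BigTable in C:\data\bigfile.dat -n -S MyServer -T -h "TABLOCK"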

    The SQL Guy @ blogspot

    @SeanPearceSQL

    About Me

  • switch from simple to bulk-logged recovery model ...
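
    For reference, the switch itself is one statement (database name hypothetical); remember to switch back and take a log backup afterwards to re-establish point-in-time recovery:

    -- Hypothetical database name; bulk-logged keeps the log small for bulk
    -- operations but limits point-in-time restore for that log interval.
    ALTER DATABASE MyDb SET RECOVERY BULK_LOGGED;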

    MCITP: Database Administrator 2005
    MCTS SQL Server 2008
    MCP SQL 2012/2014
    MCSA SQL Server 2012/2014
    MCSE Data Management and Analytics

  • Feeg (4/23/2012)


    switch from simple to bulk-logged recovery model ...

    ?

    Jared
    CE - Microsoft

  • Bulk Logged may even generate a larger transaction log ...

    ... unless you couple it with trace flag 610.

    Regards
    Rudy Komacsar
    Senior Database Administrator
    "Ave Caesar! - Morituri te salutamus."

  • from something Microsoft gave us:

    1. During data loads, it is recommended to set the database's recovery model to "Bulk Logged" and enable Trace Flag 610 to decrease load times. This is a recommended, but not required, best practice.

    Trace flag 610 controls minimally logged insert into indexed tables. When the bulk load operation causes a new page to be allocated, all of the rows sequentially filling that new page are minimally logged. (See the SQL Server Technical Article http://msdn.microsoft.com/en-us/library/dd425070(v=SQL.100).aspx for additional information.)
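
    Assuming the instance-wide scope is acceptable, enabling and disabling the flag uses standard DBCC syntax:

    -- Enable trace flag 610 globally (-1) for the duration of the load ...
    DBCC TRACEON (610, -1);
    -- ... and turn it off again once the load has finished.
    DBCC TRACEOFF (610, -1);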

    Regards
    Rudy Komacsar
    Senior Database Administrator
    "Ave Caesar! - Morituri te salutamus."

  • Will read up, thanks.

    MCITP: Database Administrator 2005
    MCTS SQL Server 2008
    MCP SQL 2012/2014
    MCSA SQL Server 2012/2014
    MCSE Data Management and Analytics

  • Just to follow up a bit, even though it's 5 years later: if done correctly and in a single batch, you can BCP all of those rows into a table that has a clustered index in place, and the load will be minimally logged even without setting TF 610, provided you follow the conditions in the article that Rudy referenced.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • This was removed by the editor as SPAM

  • JasonClark - Sunday, March 26, 2017 11:10 PM

    Use the -b option to avoid filling the transaction log space. Also, you should drop indexes and triggers to avoid logging the data inserts.

    By itself, that won't do it. If you're in the full recovery model, the log file can become quite large during a large import even if you take transaction log (TRN) backups every 15 minutes.
    If you truly want to minimize the use of the log file during BCP imports, which will also help make them much faster, please see the following article, which is still valid today.
    https://technet.microsoft.com/en-us/library/dd425070(v=sql.100).aspx
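
    Pulling the thread's advice together, an end-to-end sketch might look like this (all names are hypothetical; -E and -T assume a trusted connection, and the target table is assumed to meet the minimal-logging conditions listed earlier):

    rem Hypothetical end-to-end script for a minimally logged bcp import.
    sqlcmd -S MyServer -E -Q "ALTER DATABASE MyDb SET RECOVERY BULK_LOGGED;"
    bcp MyDb.dbo.BigTable in C:\data\bigfile.dat -n -S MyServer -T -h "TABLOCK"
    rem Restore the recovery model and restart the log backup chain.
    sqlcmd -S MyServer -E -Q "ALTER DATABASE MyDb SET RECOVERY FULL; BACKUP LOG MyDb TO DISK = 'C:\backup\MyDb_log.trn';"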

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

