Need help in transferring 400 million rows from SQL Server to SQL Server

  • Hi,

    I have to transfer 400 million rows (with 8 columns) from one SQL Server to another. I am planning to do this in one of three ways:

    1. Creating an SSIS package and calling it from a SQL Agent job step.

    ---or---

    2. Running the following statement in a SQL Agent job step:

       insert into target_tbl
       select columns from [linkserver].database.dbo.source_tbl

    ---or---

    3. Running the following statement in a SQL Agent job step:

       Drop table target_tbl
       select columns into target_tbl from [linkserver].database.dbo.source_tbl

    Please suggest which one gives the best performance, and why.

    Thanks in advance...

    Regards,

    Rocky

  • I would recommend the SSIS approach, simply because you can then control the batch and commit sizes on the OLE DB Destination (using the fast load option).

    Doing this, you can actually improve on the performance of a linked server. With a linked server, the complete statement has to finish successfully before it is committed to the database, so the transaction log needs enough free space to handle all 400 million rows being inserted in a single transaction.

    Using SSIS and defining a reasonable batch size (200,000 would be a good starting point) and a reasonable commit size (200,000 would also be a good start here), you then only need enough space in the transaction log to account for 200,000 rows. This assumes that your destination database is in the simple recovery model. If the destination database is in the full recovery model, you can add a step in the SSIS package to kick off a transaction log backup for that database.

    I would also recommend that you don't perform a delete; rather, use truncate to clear the table before inserting the data if you are going to be performing a full refresh.
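
    As a rough sketch of those truncate and log-backup steps (as Execute SQL Tasks in the package or as extra job steps), with hypothetical table, database, and path names:

    -- Clear the target before a full refresh; TRUNCATE is minimally logged, unlike DELETE
    TRUNCATE TABLE dbo.target_tbl;

    -- If the destination is in the full recovery model, back up its log during the load
    -- so it does not grow unchecked (backup path is a placeholder)
    BACKUP LOG [TargetDatabase]
    TO DISK = N'D:\Backups\TargetDatabase_log.trn';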

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • Thank you, Jeffrey.

  • I'd probably not use any of those methods, especially if it's a one-off task. I'd most likely do a native BCP out on the source and a native BCP in on the destination.

    No matter what you do, make sure that it does the work in batches and that you have transaction log backups running at pretty close intervals to keep the log file from blowing up if you're in the full recovery model.
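
    Just as a sketch of what that native BCP round trip might look like from a command prompt (server, database, table, and file names are made up here; -n is native format, -T a trusted connection, -b the number of rows committed per batch):

    rem Export from the source server in native format
    bcp SourceDb.dbo.source_tbl out D:\Transfer\source_tbl.dat -n -S SourceServer -T

    rem Import into the destination, committing every 200,000 rows; TABLOCK helps toward a minimally logged bulk load
    bcp TargetDb.dbo.target_tbl in D:\Transfer\source_tbl.dat -n -S TargetServer -T -b 200000 -h "TABLOCK"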

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Just to add to what Jeffrey said: if performance is the primary concern and you have a little time to tune the SSIS package, you can play around with the buffer sizes on your Data Flow to get maximum throughput. Reference: the DefaultBufferMaxRows and DefaultBufferSize properties.

    Adjust buffer size in SSIS data flow task by SQL Server Performance Team
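
    If the package ends up being run through dtexec (for example from a SQL Agent job), those properties can also be overridden at run time with the /SET option. A hypothetical invocation, assuming a package file and a data flow task named "Data Flow Task":

    dtexec /F "C:\Packages\LoadTarget.dtsx" ^
        /SET "\Package\Data Flow Task.Properties[DefaultBufferMaxRows]";200000 ^
        /SET "\Package\Data Flow Task.Properties[DefaultBufferSize]";104857600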

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato
