Load data in chunks daily

  • I have to load a table with around 50M rows, but I only get a window of one hour daily, and 50M rows can't be inserted in that time frame. My idea was to load 10M today and the remaining 40M on subsequent days, but the problem is that I can't add a flag column (or any column) to my source table. How can I implement this kind of logic?
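
One common approach (an editor's sketch, not from the thread) is to keep a watermark *outside* the source table, assuming the source has a unique, ever-increasing key. The table, column, and chunk-size names below are placeholders:

```sql
-- A small control table on the TARGET side remembers how far we got,
-- so the source table never needs a flag column.
CREATE TABLE dbo.LoadProgress (LastID bigint NOT NULL);
INSERT INTO dbo.LoadProgress (LastID) VALUES (0);

-- Run once per daily window: copy the next chunk of rows past the watermark.
DECLARE @LastID bigint = (SELECT LastID FROM dbo.LoadProgress);

INSERT INTO dbo.TargetTable WITH (TABLOCK) (ID, Col1, Col2)
SELECT TOP (10000000) s.ID, s.Col1, s.Col2
FROM   dbo.SourceTable AS s
WHERE  s.ID > @LastID
ORDER BY s.ID;                 -- keyset order makes the chunks deterministic

-- Advance the watermark (assumes only this load writes to the target).
UPDATE dbo.LoadProgress
SET    LastID = (SELECT MAX(ID) FROM dbo.TargetTable);
```

Because progress is tracked on the target side, the source remains completely untouched.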

  • Maybe the following can help:
    importing large SQL files

  • jonas.gunnarsson 52434 - Thursday, March 23, 2017 7:36 AM

    Maybe the following can help:
    importing large SQL files

    That link is for a MySQL solution. This is a Microsoft SQL Server forum.

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • mayank.pratap01 - Wednesday, March 22, 2017 11:46 PM

    I have to load a table with around 50M rows, but I only get a window of one hour daily, and 50M rows can't be inserted in that time frame. My idea was to load 10M today and the remaining 40M on subsequent days, but the problem is that I can't add a flag column (or any column) to my source table. How can I implement this kind of logic?

    There may be a way to do it all within the given hour.  I need some information, though.

    1.  Is the clustered index on the target table temporal or otherwise ever-increasing?
    2.  Is the data you're inserting also temporally based and will be "appended" to the logical end of the clustered index?
    3.  Other than the clustered index, are there any other "unique" indexes on the target table?
    4.  Are there any FKs that point TO the target table?
    5.  Are there any FKs on the target table that point to other tables?
    6.  How many columns does the target table have?
    7.  Are you allowed to change the Recovery Model of the target database to BULK LOGGED?
    8.  Are you allowed to temporarily set a Trace Flag (610) on the target system?
    9.  Are the source and target tables on the same server instance?
    10.  Is the source data in the same order as the target clustered index?
    11. Are you allowed to set TABLOCK on the target table?
    12. Are you allowed to disable and rebuild all non-clustered indexes on the target table?
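
The settings that questions 7, 8, and 12 ask about can be applied roughly like this (an editor's hedged sketch; database, index, and table names are placeholders, and the original recovery model should be restored, with a log backup, once the load finishes):

```sql
ALTER DATABASE TargetDb SET RECOVERY BULK_LOGGED;               -- question 7
DBCC TRACEON (610);                                             -- question 8
ALTER INDEX IX_Target_SomeColumn ON dbo.TargetTable DISABLE;    -- question 12

-- ... perform the minimally logged load here ...

ALTER INDEX IX_Target_SomeColumn ON dbo.TargetTable REBUILD;
ALTER DATABASE TargetDb SET RECOVERY FULL;  -- restore the model; take a log backup
```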

    Here's why I'm asking... and it still applies to 2016.
    https://technet.microsoft.com/en-us/library/dd425070(v=sql.100).aspx 

    If that can't be done in this case (magic INSERT/SELECT), I still need to know the answers to the other questions above because there may be another way using a temporary partition on the target table.

    Using a "magic" Insert/Select on an existing table during a demo on minimal logging, I was recently able to insert 7 Million rows of about 1,140 bytes for each row from one table to a target table in under 3 minutes with less than 50MB used in the log file with the clustered index in place.  That didn't include rebuilding any non-clustered indexes.  At that rate, you should be able to do the 50 million rows in something less than 22 minutes with time left over to rebuild some indexes.  If you have the "expensive" edition, you don't even have to worry about time overruns there because you can rebuild them online (although that makes the rebuilds a fair bit slower).

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
