Load data in chunks daily

  • I have to load a table with around 50M rows, but I only get a window of one hour daily, and 50M rows can't be inserted in that time frame. My idea was to load 10M today and the remaining 40M on subsequent days, but the problem is that I can't add a flag column (or any column) to my source table. How can I implement this kind of logic?
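
One common approach (an editor's sketch, not from the thread) is to keep a watermark *outside* the source table, assuming the source has a unique, ever-increasing key. The table, column, and chunk-size names below are placeholders:

```sql
-- A small control table on the TARGET side remembers how far we got,
-- so the source table never needs a flag column.
CREATE TABLE dbo.LoadProgress (LastID bigint NOT NULL);
INSERT INTO dbo.LoadProgress (LastID) VALUES (0);

-- Run once per daily window: copy the next chunk of rows past the watermark.
DECLARE @LastID bigint = (SELECT LastID FROM dbo.LoadProgress);

INSERT INTO dbo.TargetTable WITH (TABLOCK) (ID, Col1, Col2)
SELECT TOP (10000000) s.ID, s.Col1, s.Col2
FROM   dbo.SourceTable AS s
WHERE  s.ID > @LastID
ORDER BY s.ID;                 -- keyset order makes the chunks deterministic

-- Advance the watermark (assumes only this load writes to the target).
UPDATE dbo.LoadProgress
SET    LastID = (SELECT MAX(ID) FROM dbo.TargetTable);
```

Because progress is tracked on the target side, the source remains completely untouched.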

  • Maybe the following can help:
    importing large SQL files

  • jonas.gunnarsson 52434 - Thursday, March 23, 2017 7:36 AM

    Maybe the following can help:
    importing large SQL files

    That link is for a MySQL solution. This is a Microsoft SQL Server forum.

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • mayank.pratap01 - Wednesday, March 22, 2017 11:46 PM

    I have to load a table with around 50M rows, but I only get a window of one hour daily, and 50M rows can't be inserted in that time frame. My idea was to load 10M today and the remaining 40M on subsequent days, but the problem is that I can't add a flag column (or any column) to my source table. How can I implement this kind of logic?

    There may be a way to do it all within the given hour.  I need some information, though.

    1.  Is the clustered index on the target table temporal or otherwise ever-increasing?
    2.  Is the data you're inserting also temporally based and will be "appended" to the logical end of the clustered index?
    3.  Other than the clustered index, are there any other "unique" indexes on the target table?
    4.  Are there any FKs that point TO the target table?
    5.  Are there any FKs on the target table that point to other tables?
    6.  How many columns does the target table have?
    7.  Are you allowed to change the Recovery Model of the target database to BULK LOGGED?
    8.  Are you allowed to temporarily set a Trace Flag (610) on the target system?
    9.  Are the source and target tables on the same server instance?
    10.  Is the source data in the same order as the target clustered index?
    11. Are you allowed to set TABLOCK on the target table?
    12. Are you allowed to disable and rebuild all non-clustered indexes on the target table?
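
The settings that questions 7, 8, and 12 ask about can be applied roughly like this (an editor's hedged sketch; database, index, and table names are placeholders, and the original recovery model should be restored, with a log backup, once the load finishes):

```sql
ALTER DATABASE TargetDb SET RECOVERY BULK_LOGGED;               -- question 7
DBCC TRACEON (610);                                             -- question 8
ALTER INDEX IX_Target_SomeColumn ON dbo.TargetTable DISABLE;    -- question 12

-- ... perform the minimally logged load here ...

ALTER INDEX IX_Target_SomeColumn ON dbo.TargetTable REBUILD;
ALTER DATABASE TargetDb SET RECOVERY FULL;  -- restore the model; take a log backup
```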

    Here's why I'm asking... and it still applies to 2016.
    https://technet.microsoft.com/en-us/library/dd425070(v=sql.100).aspx 

    If that can't be done in this case (magic INSERT/SELECT), I still need to know the answers to the other questions above because there may be another way using a temporary partition on the target table.

    Using a "magic" Insert/Select on an existing table during a demo on minimal logging, I was recently able to insert 7 Million rows of about 1,140 bytes for each row from one table to a target table in under 3 minutes with less than 50MB used in the log file with the clustered index in place.  That didn't include rebuilding any non-clustered indexes.  At that rate, you should be able to do the 50 million rows in something less than 22 minutes with time left over to rebuild some indexes.  If you have the "expensive" edition, you don't even have to worry about time overruns there because you can rebuild them online (although that makes the rebuilds a fair bit slower).

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
