SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


which way is the better for 1 million records inserting into one table


which way is the better for 1 million records inserting into one table

Author
Message
cooljagadeesh
cooljagadeesh
Valued Member
Valued Member (66 reputation)Valued Member (66 reputation)Valued Member (66 reputation)Valued Member (66 reputation)Valued Member (66 reputation)Valued Member (66 reputation)Valued Member (66 reputation)Valued Member (66 reputation)

Group: General Forum Members
Points: 66 Visits: 218
which way is the better for 1 million records inserting into one table

1) insert into destination
select * from source


2) BULK INSERT destination
FROM '\\computer\source.txt';

3) BCP
MyDoggieJessie
MyDoggieJessie
SSCertifiable
SSCertifiable (6.7K reputation)SSCertifiable (6.7K reputation)SSCertifiable (6.7K reputation)SSCertifiable (6.7K reputation)SSCertifiable (6.7K reputation)SSCertifiable (6.7K reputation)SSCertifiable (6.7K reputation)SSCertifiable (6.7K reputation)

Group: General Forum Members
Points: 6696 Visits: 7394
If the source data isn't too "wide", BCP would work quite well, but I'd recommend using SSIS so you can take more advantage of simultaneous/parallel threads

Take a quick look through these great articles:
Optimizing Bulk Import Performance
http://msdn.microsoft.com/en-us/library/ms190421(v=sql.105).aspx

The Data Loading Performance Guide
http://technet.microsoft.com/en-us/library/dd425070(SQL.100).aspx

We Loaded 1TB in 30 Minutes with SSIS, and So Can You
http://msdn.microsoft.com/en-us/library/dd537533(v=sql.100).aspx

______________________________________________________________________________
"Never argue with an idiot; They'll drag you down to their level and beat you with experience" ;-)
Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)

Group: General Forum Members
Points: 86978 Visits: 41107
That article on loading 1TB in 30 minutes seems to be impressive testament for SSIS but they had 4 instances of SSIS running 56 streams to a 64 processor database using 56 different tables (later, switched into a paritioned table) on a full hardware package (including a killer disk system with 165 spindles and 8GB fiber channels) that would have made the makers of Cray green with envy. From what I can see, SSIS didn't have anything to do with the performance. If they had executed 56 parallel streams using BULK INSERT, they likely would have gotten even better performance and they wouldn't have needed the extra 4 servers that SSIS lived on. ;-)

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
MyDoggieJessie
MyDoggieJessie
SSCertifiable
SSCertifiable (6.7K reputation)SSCertifiable (6.7K reputation)SSCertifiable (6.7K reputation)SSCertifiable (6.7K reputation)SSCertifiable (6.7K reputation)SSCertifiable (6.7K reputation)SSCertifiable (6.7K reputation)SSCertifiable (6.7K reputation)

Group: General Forum Members
Points: 6696 Visits: 7394
Fair point :-)

For ease of use, I'd go with BCP. Which would you recommend?

______________________________________________________________________________
"Never argue with an idiot; They'll drag you down to their level and beat you with experience" ;-)
Eirikur Eiriksson
Eirikur Eiriksson
SSCoach
SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)

Group: General Forum Members
Points: 15260 Visits: 18607
cooljagadeesh (3/27/2014)
which way is the better for 1 million records inserting into one table

1) insert into destination
select * from source


2) BULK INSERT destination
FROM '\\computer\source.txt';

3) BCP


Quick questions; is this a one off thing? Does it need to be managed, logged, reported? Are all files in the same format? Is speed more important than server load? w00t Or is there a file in the first place? Is it maybe a table to table load? Cool

The options you list imply both table and file source.
Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)

Group: General Forum Members
Points: 86978 Visits: 41107
MyDoggieJessie (3/28/2014)
Fair point :-)

For ease of use, I'd go with BCP. Which would you recommend?


If there are no "double-hop" or server trust problems for BULK INSERT, I'd go with BULK INSERT over BCP. That way, it can all be done from T-SQL without someone getting bent out of shape about using xp_CmdShell or having to do things from a command prompt or etc, etc. In the absence of xp_CmdShell, it's also a whole lot easier to make dynamic insofar as source and destination than using BCP. With that in mind, you can actually have a proc that does like they did in the 1TB/30Minute example insofar as parallel loads go. You can have a stored proc that creates multiple jobs and starts them all. Each job would poke a table with an "I'm done" marker that the main proc could check for completion. The jobs could be self-dropping or you could modify them from the main proc. Of course, such parallel loads don't do much for performance if your destination table/partitions are all on the same disk(s) regardless of how many files/filegroups you have. At best, I've only seen a nearly insignificant gain in load performance that way (same set of spindles) and I've also seen MUCH worse performance because the poor ol' read/write heads spend so much time thrashing on the same spindles.

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search