Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase

which way is the better for 1 million records inserting into one table Expand / Collapse
Author
Message
Posted Thursday, March 27, 2014 9:10 PM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Sunday, April 20, 2014 10:08 PM
Points: 30, Visits: 158
which way is the better for 1 million records inserting into one table

1) insert into destination
select * from source


2) BULK INSERT destination
FROM '\\computer\source.txt';

3) BCP
Post #1555745
Posted Thursday, March 27, 2014 9:28 PM


Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: Today @ 7:13 AM
Points: 3,893, Visits: 7,137
If the source data isn't too "wide", BCP would work quite well, but I'd recommend using SSIS so you can take more advantage of simultaneous/parallel threads

Take a quick look through these great articles:
Optimizing Bulk Import Performance
http://msdn.microsoft.com/en-us/library/ms190421(v=sql.105).aspx

The Data Loading Performance Guide
http://technet.microsoft.com/en-us/library/dd425070(SQL.100).aspx

We Loaded 1TB in 30 Minutes with SSIS, and So Can You
http://msdn.microsoft.com/en-us/library/dd537533(v=sql.100).aspx


______________________________________________________________________________
"Never argue with an idiot; They'll drag you down to their level and beat you with experience"
Post #1555748
Posted Thursday, March 27, 2014 10:56 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Yesterday @ 9:18 PM
Points: 36,938, Visits: 31,441
That article on loading 1TB in 30 minutes seems to be impressive testament for SSIS but they had 4 instances of SSIS running 56 streams to a 64 processor database using 56 different tables (later, switched into a paritioned table) on a full hardware package (including a killer disk system with 165 spindles and 8GB fiber channels) that would have made the makers of Cray green with envy. From what I can see, SSIS didn't have anything to do with the performance. If they had executed 56 parallel streams using BULK INSERT, they likely would have gotten even better performance and they wouldn't have needed the extra 4 servers that SSIS lived on.

--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1555763
Posted Friday, March 28, 2014 1:06 AM


Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: Today @ 7:13 AM
Points: 3,893, Visits: 7,137
Fair point

For ease of use, I'd go with BCP. Which would you recommend?


______________________________________________________________________________
"Never argue with an idiot; They'll drag you down to their level and beat you with experience"
Post #1555782
Posted Friday, March 28, 2014 1:57 AM
SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Today @ 7:00 AM
Points: 1,532, Visits: 4,240
cooljagadeesh (3/27/2014)
which way is the better for 1 million records inserting into one table

1) insert into destination
select * from source


2) BULK INSERT destination
FROM '\\computer\source.txt';

3) BCP


Quick questions; is this a one off thing? Does it need to be managed, logged, reported? Are all files in the same format? Is speed more important than server load? Or is there a file in the first place? Is it maybe a table to table load?

The options you list imply both table and file source.
Post #1555795
Posted Friday, March 28, 2014 10:55 AM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Yesterday @ 9:18 PM
Points: 36,938, Visits: 31,441
MyDoggieJessie (3/28/2014)
Fair point

For ease of use, I'd go with BCP. Which would you recommend?


If there are no "double-hop" or server trust problems for BULK INSERT, I'd go with BULK INSERT over BCP. That way, it can all be done from T-SQL without someone getting bent out of shape about using xp_CmdShell or having to do things from a command prompt or etc, etc. In the absence of xp_CmdShell, it's also a whole lot easier to make dynamic insofar as source and destination than using BCP. With that in mind, you can actually have a proc that does like they did in the 1TB/30Minute example insofar as parallel loads go. You can have a stored proc that creates multiple jobs and starts them all. Each job would poke a table with an "I'm done" marker that the main proc could check for completion. The jobs could be self-dropping or you could modify them from the main proc. Of course, such parallel loads don't do much for performance if your destination table/partitions are all on the same disk(s) regardless of how many files/filegroups you have. At best, I've only seen a nearly insignificant gain in load performance that way (same set of spindles) and I've also seen MUCH worse performance because the poor ol' read/write heads spend so much time thrashing on the same spindles.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1556054
« Prev Topic | Next Topic »

Add to briefcase

Permissions Expand / Collapse