SQLServerCentral is supported by Redgate
 
 
 
 


Fast ways to load data from SQL server to SQL server



george_at_sql
UDP Broadcaster

Group: General Forum Members
Points: 1435 Visits: 1911
Hi
I have a table in SQL Server with over 5 million records and around 12 columns, and I wish to load it into a table on another SQL Server.

There are no transformations of the data, and the table structure is identical in source and destination.

There are no indexes on the source table, nor any column by which I could split the 5 million records into "partitions", in which case I could have parallelized the data load.

I tried an SSIS project with SQL Server as both source and destination; however, it was taking more than 3-4 hours and erroring out after 1 million records.

After that I tried SQL Server as source and a flat file as destination. It didn't error out, but it was taking a lot of time (>5 hours), and I don't know if this is a feasible option in the actual production environment.
The idea is that once the flat file is created, I would import it into the destination SQL Server table using SSIS.

Are there any other options available that can speed up the data load?
Eirikur Eiriksson
SSC Guru

Group: General Forum Members
Points: 161532 Visits: 23597
george_at_sql - Tuesday, June 12, 2018 9:00 PM

How about using backup and restore?
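As a rough sketch, the backup-and-restore route could look like the following; the database name, file paths, and logical file names here are all hypothetical and would need to match your environment:

```sql
-- On the source server: a COPY_ONLY backup doesn't disturb the backup chain.
BACKUP DATABASE SourceDb
    TO DISK = N'\\fileshare\backups\SourceDb.bak'
    WITH COPY_ONLY, COMPRESSION, INIT;

-- On the destination server, after copying the .bak file over:
RESTORE DATABASE SourceDb_Copy
    FROM DISK = N'C:\backups\SourceDb.bak'
    WITH MOVE N'SourceDb'     TO N'D:\data\SourceDb_Copy.mdf',
         MOVE N'SourceDb_log' TO N'E:\log\SourceDb_Copy.ldf';
```

This copies the whole database rather than one table, so it only makes sense if that is acceptable on the destination side.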

george_at_sql
UDP Broadcaster

Group: General Forum Members
Points: 1435 Visits: 1911
Eirikur Eiriksson - Tuesday, June 12, 2018 10:21 PM
How about using backup and restore?
I will look into that option after asking the DBA.
Do you have any positive feedback on using a linked server for large data transfers?

sgmunson
SSC Guru

Group: General Forum Members
Points: 95604 Visits: 7179
The first question I'll ask is how much network bandwidth exists between the two servers. If, for whatever reason, these servers are on 10/100 Mb network cards, that alone might be enough to cause you a ton of grief. You need gigabit or fiber network cards to reduce the network latency, and if you have any 10/100 switches or routers in between, that hardware becomes a bottleneck; gigabit switches or routers would be far better.

Just 5 million rows seems like a rather small number to have such a long transfer time, so unless you have a bunch of varchar(max) or nvarchar(max) columns, there's something else going on. So question number two is: how much data does the table contain? You can right-click the table in SSMS, choose Properties, then click Storage; you'll see the number of rows and the amount of data space.

Question three: what are the servers' specs in terms of RAM and CPU? Post back and chances are the true problem gets revealed.

Steve
‌(aka sgmunson)
Health & Nutrition
Luis Cazares
SSC Guru

Group: General Forum Members
Points: 163764 Visits: 22795
In addition to what Steve just mentioned, you should also take into account that there's extra time when you run the SSIS package from your machine, as the data needs to travel to it before getting to the destination.
If the network is slow, use bcp to export the data to a flat file on the source (delimited, not fixed length, please), copy the file to the destination, and load the file from there. It might still be slow, but you won't be locking your tables for hours.
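A rough sketch of that bcp round trip; the server, database, and table names below are hypothetical:

```bat
:: On (or pointing at) the source server: export in character mode, pipe-delimited
bcp SourceDb.dbo.BigTable out C:\temp\BigTable.dat -S SourceServer -T -c -t "|"

:: Copy the file to the destination server, then load it locally there,
:: committing in 50k-row batches with a table lock for minimal logging:
bcp DestDb.dbo.BigTable in C:\temp\BigTable.dat -S DestServer -T -c -t "|" -b 50000 -h "TABLOCK"
```

Running the `in` step on the destination server itself keeps the bulk load off the network entirely; only the file copy crosses the wire.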


Luis C.
General Disclaimer:
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?


How to post data/code on a forum to get the best help: Option 1 / Option 2
Scott Coleman
SSC-Insane

Group: General Forum Members
Points: 24216 Visits: 2177
A single 5-million-row INSERT can be a huge problem; it may fill up the transaction log even if the database is in simple recovery. A minimally logged bcp insert is probably the best way to handle it as one file while minimizing this impact.

The process would run fastest if the flat file path is on a separate drive from the SQL data files on either server.

If the network is a serious bottleneck, you could try exporting to a file on the source server, copying the file to the destination, and importing it from there. It is the same number of bytes transferred, but I have seen cases where Windows file transfer between servers uses the network more efficiently than SQL Server I/O.

You could split the source file into pieces in SSIS despite not having an index or key column. Create a set of flat file destinations, and use a source query of "SELECT *, FileNumber = (ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) % n) FROM table" (where n = number of files). Then use a Conditional Split component to distribute the rows by the FileNumber column. This requires one table scan at the source, so it shouldn't make the export take longer, but you may have more luck loading multiple files rather than one big one. (You could also use a Conditional Split Script Transform, where you increment a counter and redirect the rows with relatively simple C# code. The number of files wouldn't have to be hardcoded in the script; it could simply use the number of outputs.)
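Concretely, the split query described above might look like this, assuming a hypothetical table name and n = 4 output files:

```sql
-- Tag each row with a file number 0..3 in a single table scan;
-- ORDER BY (SELECT NULL) imposes no sort, just an arbitrary numbering.
-- An SSIS Conditional Split then routes rows on FileNumber.
SELECT t.*,
       FileNumber = ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) % 4
FROM dbo.BigTable AS t;
```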



Luis Cazares
SSC Guru

Group: General Forum Members
Points: 163764 Visits: 22795
Scott Coleman - Wednesday, June 13, 2018 10:05 AM
FYI, SSIS does not generate a single insert when loading data to a table. Also, 5 million rows is not a huge insert that would take multiple hours. Even my laptop could handle that within a few minutes.



Phil Parkin
SSC Guru

Group: General Forum Members
Points: 217040 Visits: 24908
Luis Cazares - Wednesday, June 13, 2018 10:29 AM
FYI, SSIS does not generate a single insert when loading data to a table. Also, 5 million rows is not a huge insert that would take multiple hours. Even my laptop could handle that within a few minutes.

I think it might, depending on the settings for 'Rows Per Batch' and 'Maximum Insert Commit Size'.



Help us to help you. For better, quicker and more-focused answers to your questions, consider following the advice in this link.

If the answer to your question can be found with a brief Google search, please perform the search yourself, rather than expecting one of the SSC members to do it for you.

Please surround any code or links you post with the appropriate IFCode formatting tags. It helps readability a lot.
Jeffrey Williams 3188
SSC Guru

Group: General Forum Members
Points: 78243 Visits: 11301
Luis Cazares - Wednesday, June 13, 2018 10:29 AM
FYI, SSIS does not generate a single insert when loading data to a table. Also, 5 million rows is not a huge insert that would take multiple hours. Even my laptop could handle that within a few minutes.

Luis - it depends on the object used for the destination and the settings for that object. The default settings for the OLE DB Destination will create a single transaction, but you can control that by using the fast load option (bulk insert) and setting appropriate batch/commit sizes.
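For comparison, a T-SQL sketch of the same idea - a bulk load with batched commits and a table lock; the file path, terminators, batch size, and table name are assumptions:

```sql
-- BATCHSIZE commits every 50k rows, so the log never holds all 5M rows;
-- TABLOCK allows minimal logging under simple or bulk-logged recovery.
BULK INSERT dbo.BigTable
FROM 'C:\temp\BigTable.dat'
WITH (
    FIELDTERMINATOR = '|',
    ROWTERMINATOR   = '\n',
    BATCHSIZE       = 50000,
    TABLOCK
);
```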


Jeffrey Williams
Problems are opportunities brilliantly disguised as insurmountable obstacles.

How to post questions to get better answers faster
Managing Transaction Logs

george_at_sql
UDP Broadcaster

Group: General Forum Members
Points: 1435 Visits: 1911
sgmunson - Wednesday, June 13, 2018 6:39 AM

You are right, there are varchar(max) columns in that table.

I am trying to find a query that returns the space occupied by a table, since we have restricted access to the source database and I get an error when trying the right-click option.

The network, I think, would be the regular 10/100 Mbit kind, since it's not a high-end server of any sort, being a development database.
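A minimal sketch of such a query (the table name is a placeholder); sp_spaceused needs much less access than the SSMS Properties dialog, and the DMV alternative requires VIEW DATABASE STATE:

```sql
-- Rows, reserved, data, and index sizes for one table:
EXEC sp_spaceused N'dbo.BigTable';

-- Alternative via DMVs if sp_spaceused is also restricted:
SELECT SUM(ps.row_count)                        AS row_count,
       SUM(ps.reserved_page_count) * 8 / 1024   AS reserved_mb
FROM sys.dm_db_partition_stats AS ps
WHERE ps.object_id = OBJECT_ID(N'dbo.BigTable');
```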
