Incremental Table Copy

  • Hi,
I have 3 huge tables, each in excess of 100 GB, which need to be copied between two SQL Server 2017 instances. Replication is not an option because of proximity and other limitations. If I write logic to copy the tables over a linked server, I then need some sort of logic to do incremental data transfers to reduce the data load time. What are the recommended options? One of the tables doesn't have a unique key or timestamp.

  • sizal0234 - Friday, November 16, 2018 6:14 PM

    No unique key or timestamp... I'd say either use CDC (change data capture) or create a trigger that writes new data to a separate table. Then create a process that sends all of the data from the "capture table" to the destination servers and then truncates the capture table. That way you never have to read from big tables and there's never any doubt about what needs to be sent.
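As a rough sketch of the CDC route suggested above (table and capture-instance names here are hypothetical, and enabling CDC requires SQL Server Agent to be running):

```sql
-- Enable CDC at the database level, then on the source table.
-- 'BigTable' is a placeholder name for one of the 100 GB tables.
EXEC sys.sp_cdc_enable_db;

EXEC sys.sp_cdc_enable_table
    @source_schema = N'dbo',
    @source_name   = N'BigTable',
    @role_name     = NULL;   -- no gating role

-- Pull every change since the last transfer, using the capture
-- instance's LSN range instead of a key or timestamp column.
DECLARE @from_lsn BINARY(10) = sys.fn_cdc_get_min_lsn('dbo_BigTable');
DECLARE @to_lsn   BINARY(10) = sys.fn_cdc_get_max_lsn();

SELECT *
FROM cdc.fn_cdc_get_all_changes_dbo_BigTable(@from_lsn, @to_lsn, N'all');
```

In a real process you'd persist @to_lsn after each successful transfer and use it (advanced with sys.fn_cdc_increment_lsn) as the next @from_lsn, so each run only ships rows changed since the previous run.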

  • sizal0234 - Friday, November 16, 2018 6:14 PM

How would you identify rows to be inserted/updated if you don't have a unique key?  I would also be concerned about using a linked server in this scenario - because the linked server will either pull all data from the remote table (because there is no unique key) or will default to a cursor-based solution (one row at a time).  If you cannot utilize replication because of proximity - I am assuming that is because each system is in a different location - that would make a linked server even less suitable...

    I would look at SSIS to extract and load the data - or a BCP out to a file and BCP in to the destination.  I prefer SSIS as it allows for extracting and loading to occur as the rows are passed through the pipeline whereas with BCP you have to complete the full extract before you can begin the load process.

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • If the source and destination are on the same SAN, some SANs have the option (and sometimes it's a paid option) to do "SAN Replication", which is nasty fast.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Maybe you can use this method, especially if your tables have only append activity:
create parallel tables with the same structure.
Create an AFTER INSERT trigger on the two original tables that inserts the same records into the cloned tables.
The copy process then moves records from the cloned tables to the remote site (copy the records and truncate the tables).
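A minimal sketch of the trigger-plus-clone approach described above, assuming insert-only activity (all object and server names are hypothetical):

```sql
-- Clone table with the same structure as the source, but no data.
SELECT * INTO dbo.BigTable_Capture FROM dbo.BigTable WHERE 1 = 0;
GO

-- AFTER INSERT trigger copies each new row into the capture table.
CREATE TRIGGER trg_BigTable_Capture
ON dbo.BigTable
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;
    INSERT INTO dbo.BigTable_Capture
    SELECT * FROM inserted;
END;
GO

-- Periodic copy job: push captured rows to the remote server,
-- then clear the capture table.
INSERT INTO RemoteServer.RemoteDB.dbo.BigTable
SELECT * FROM dbo.BigTable_Capture;

TRUNCATE TABLE dbo.BigTable_Capture;
```

Note one gap in this sketch: rows inserted between the remote INSERT and the TRUNCATE would be lost, so a real job should either wrap both in a transaction or delete only the rows it actually copied.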

  • moreno - Sunday, November 18, 2018 4:34 AM

A trigger would consume CPU and increase worker threads if parallel connections insert data into the table.
Identify a primary key, or add a row-insertion-date column, to uniquely identify daily data. Write dynamic code to get the min and max values to avoid data inconsistencies.
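Once a key exists, the min/max batching idea might look like this over a linked server (table, server, and batch-size values are illustrative):

```sql
-- Copy rows in bounded key ranges so each statement over the
-- linked server moves a limited, predictable amount of data.
DECLARE @min   BIGINT,
        @max   BIGINT,
        @batch BIGINT = 100000;   -- rows per range, tune to taste

SELECT @min = MIN(Id), @max = MAX(Id) FROM dbo.BigTable;

WHILE @min <= @max
BEGIN
    INSERT INTO RemoteServer.RemoteDB.dbo.BigTable
    SELECT *
    FROM dbo.BigTable
    WHERE Id >= @min AND Id < @min + @batch;

    SET @min += @batch;
END;
```

For incremental runs you would start @min from the highest Id already present at the destination instead of MIN(Id), so only new rows are shipped.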

Seconding Jeff on the SAN Replication if it's available.

I've used that for almost exactly this kind of work and it was AWESOME.

Thank you all for the responses. I managed to get unique keys identified for these tables and tested CDC. This has worked well so far during testing.
