Incremental Table Copy

  • Hi,
I have 3 huge tables, each in excess of 100 GB, which need to be copied between two SQL Server 2017 instances. Replication is not an option because of proximity and other limitations. If I write logic to copy the tables over a linked server, I then need some sort of logic to do incremental data transfers to reduce the data load time. What are the recommended options? One of the tables doesn't have a unique key or timestamp.

  • sizal0234 - Friday, November 16, 2018 6:14 PM

    No unique key or timestamp... I'd say either use CDC (change data capture) or create a trigger that writes new data to a separate table. Then create a process that sends all of the data from the "capture table" to the destination servers and then truncates the capture table. That way you never have to read from big tables and there's never any doubt about what needs to be sent.
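As a rough sketch of the CDC route suggested above (table and capture-instance names here are hypothetical, and enabling CDC requires SQL Server Agent to be running):

```sql
-- Enable CDC at the database level, then on the source table.
-- 'BigTable' is a placeholder name for one of the 100 GB tables.
EXEC sys.sp_cdc_enable_db;

EXEC sys.sp_cdc_enable_table
    @source_schema = N'dbo',
    @source_name   = N'BigTable',
    @role_name     = NULL;   -- no gating role

-- Pull every change since the last transfer, using the capture
-- instance's LSN range instead of a key or timestamp column.
DECLARE @from_lsn BINARY(10) = sys.fn_cdc_get_min_lsn('dbo_BigTable');
DECLARE @to_lsn   BINARY(10) = sys.fn_cdc_get_max_lsn();

SELECT *
FROM cdc.fn_cdc_get_all_changes_dbo_BigTable(@from_lsn, @to_lsn, N'all');
```

In a real process you'd persist @to_lsn after each successful transfer and use it (advanced with sys.fn_cdc_increment_lsn) as the next @from_lsn, so each run only ships rows changed since the previous run.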

  • sizal0234 - Friday, November 16, 2018 6:14 PM

How would you identify rows to be inserted/updated if you don't have a unique key?  I would also be concerned about using a linked server in this scenario - because the linked server will either pull all data from the remote table (because there is no unique key) or will default to a cursor-based solution (one row at a time).  If you cannot utilize replication because of proximity - I am assuming that is because each system is in a different location - that would make a linked server even less suitable...

    I would look at SSIS to extract and load the data - or a BCP out to a file and BCP in to the destination.  I prefer SSIS as it allows for extracting and loading to occur as the rows are passed through the pipeline whereas with BCP you have to complete the full extract before you can begin the load process.

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • If the source and destination are on the same SAN, some SANs have the option (and sometimes it's a paid option) to do "SAN Replication", which is nasty fast.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Maybe you can use this method, especially if your tables have only append activity:
create parallel tables with the same structure.
Create an AFTER INSERT trigger on the two original tables that inserts the same records into the cloned tables.
The copy process then moves records from the cloned tables to the remote site (copy the records and truncate the tables).
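A minimal sketch of the trigger-plus-clone approach described above, assuming insert-only activity (all object and server names are hypothetical):

```sql
-- Clone table with the same structure as the source, but no data.
SELECT * INTO dbo.BigTable_Capture FROM dbo.BigTable WHERE 1 = 0;
GO

-- AFTER INSERT trigger copies each new row into the capture table.
CREATE TRIGGER trg_BigTable_Capture
ON dbo.BigTable
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;
    INSERT INTO dbo.BigTable_Capture
    SELECT * FROM inserted;
END;
GO

-- Periodic copy job: push captured rows to the remote server,
-- then clear the capture table.
INSERT INTO RemoteServer.RemoteDB.dbo.BigTable
SELECT * FROM dbo.BigTable_Capture;

TRUNCATE TABLE dbo.BigTable_Capture;
```

Note one gap in this sketch: rows inserted between the remote INSERT and the TRUNCATE would be lost, so a real job should either wrap both in a transaction or delete only the rows it actually copied.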

  • moreno - Sunday, November 18, 2018 4:34 AM

A trigger would consume CPU and increase worker threads if parallel connections insert data into the table.
Identify a primary key, or add a row-insertion-date column, to uniquely identify daily data. Write dynamic code to get the min and max values to avoid data inconsistencies.
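Once a key exists, the min/max batching idea might look like this over a linked server (table, server, and batch-size values are illustrative):

```sql
-- Copy rows in bounded key ranges so each statement over the
-- linked server moves a limited, predictable amount of data.
DECLARE @min   BIGINT,
        @max   BIGINT,
        @batch BIGINT = 100000;   -- rows per range, tune to taste

SELECT @min = MIN(Id), @max = MAX(Id) FROM dbo.BigTable;

WHILE @min <= @max
BEGIN
    INSERT INTO RemoteServer.RemoteDB.dbo.BigTable
    SELECT *
    FROM dbo.BigTable
    WHERE Id >= @min AND Id < @min + @batch;

    SET @min += @batch;
END;
```

For incremental runs you would start @min from the highest Id already present at the destination instead of MIN(Id), so only new rows are shipped.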

Seconding Jeff on the SAN Replication if it's available.

I've used that for almost exactly this kind of work and it was AWESOME.

Thank you all for the responses. I managed to get unique keys identified for these tables and tested CDC. This has worked well so far during testing.
