Sync tables from different server with million of records

  • Hello,
    I need to sync 2 tables from different server. Both table contains about 1055088770 rows.About 70% records are sync but i need to sync rest 30% rows (update, insert or delete based on source). I need to do in a loop with smaller chunk so it wont fill up log space. I may use SQL red gate or other third party tool but this will create very large script within one update and commit transaction and it will fill up log space.
    Any custom script will be great help.

  • There's not nearly enough info here to make a good recommendation. 

    Is the sync one way, or both ways?
    For example, ServerA  has records 1,2,3, and 5.  ServerB contains records 1, 2, 3, and 4. Does row 4 need to go to ServerA, and row 5 need to go to ServerB?
    OR
    ServerA has 1, 2, 3, and 4, ServerB has 1, 2, 3.  Record 4 needs to go to ServerB.

    Are the structures of the tables the same?

    How do you determine what rows need to go where?  

    Here's one thought, which may work for both scenarios. Again, this is a semi-educated guess based upon the info provided. 
    1. Create a table on both servers that consists of the primary key to the affected table.
    2. Create a view that selects the primary key.  
    3. BCP OUT the results of the view from ServerA, BCP OUT the results of the view from ServerB
    4. BCP IN to the new table on each server. 
    5. Create a second view that selects the records from the table where they do not exist in the primary key table.
    6. BCP OUT the results from the view.
    7. BCP IN to the other server.

    BCP (or bulk insert) is a minimally logged operation, which should prevent the transaction log from filling up.  I suspect it will also be far faster.

    Michael L John
    If you assassinate a DBA, would you pull a trigger?
    To properly post on a forum:
    http://www.sqlservercentral.com/articles/61537/

  • Michael L John - Wednesday, August 15, 2018 8:03 AM

    There's not nearly enough info here to make a good recommendation. 

    Is the sync one way, or both ways?
    For example, ServerA  has records 1,2,3, and 5.  ServerB contains records 1, 2, 3, and 4. Does row 4 need to go to ServerA, and row 5 need to go to ServerB?
    OR
    ServerA has 1, 2, 3, and 4, ServerB has 1, 2, 3.  Record 4 needs to go to ServerB.

    Are the structures of the tables the same?

    How do you determine what rows need to go where?  

    Here's one thought, which may work for both scenarios. Again, this is a semi-educated guess based upon the info provided. 
    1. Create a table on both servers that consists of the primary key to the affected table.
    2. Create a view that selects the primary key.  
    3. BCP OUT the results of the view from ServerA, BCP OUT the results of the view from ServerB
    4. BCP IN to the new table on each server. 
    5. Create a second view that selects the records from the table where they do not exist in the primary key table.
    6. BCP OUT the results from the view.
    7. BCP IN to the other server.

    BCP (or bulk insert) is a minimally logged operation, which should prevent the transaction log from filling up.  I suspect it will also be far faster.

    Thanks for your prompt response and sorry for not providing enough information. Table structure are same on both server (PK is combination of two columns) and need to be sync based on Source server i.e Server A and Server B.
    If there is missing rows in Server B then add from Server A, if not matching rows then delete from Server B based on A, and also update it based on Server A to server B.

    Please let me know if you require more information.


    CREATE TABLE [dbo].[testInfo(
        [EID] [nvarchar](80) NOT NULL,
        [SetID] [int] NOT NULL,
        [ParentSetID] [int] NULL,
        [ParentParentSetID] [int] NULL,
        [MainSetID] [int] NULL,
        [TestId] [nvarchar](255) NULL,
        [TestLocale] [nvarchar](255) NULL,
        [TestLifeCycleId] [int] NULL,
        [TestCode] [nvarchar](255) NULL,
        [TestCodeDescription] [nvarchar](255) NULL,
        [TestPriority] [nvarchar](255) NULL,
        [TestStatus] [nvarchar](255) NULL,
        [TestType] [nvarchar](255) NULL,
        [TestDepartment] [nvarchar](255) NULL,
        [FacilityId] [int] NULL,
        [LicenceNumber] [nvarchar](255) NULL,
        [LicenceDesc] [nvarchar](255) NULL,
        [TestBillLicence] [nvarchar](255) NULL,
        [TestBillLicenceDesc] [nvarchar](255) NULL,
        [TestShortName] [nvarchar](255) NULL,
        [TestWorkSheet] [nvarchar](255) NULL,
        [TestReportType] [nvarchar](255) NULL,
        [TestTATTarget] [nvarchar](255) NULL,
        [TestTATTargetHours] [int] NULL,
        [TestDirector] [nvarchar](255) NULL,
        [TestDirectorName] [nvarchar](255) NULL,
        [TestBilledFlag] [nvarchar](255) NULL,
        [TestOHIPCode] [nvarchar](255) NULL,
        [TestAddOnFlag] [nvarchar](255) NULL,
        [DepartmentSampleNbr] [nvarchar](255) NULL,
        [ResultsExpected] [nvarchar](255) NULL,
        [TestEntryDateTime] [datetime] NULL,
        [TestEntryDateTime_UTC] [datetime] NULL,
        [TestEntryDateTime_o] [smallint] NULL,
        [TestEntryDate] [nvarchar](255) NULL,
        [TestEntryTime] [nvarchar](255) NULL,
        [TestEntryUser] [nvarchar](255) NULL,
        [TestCollectionDateTime] [datetime] NULL,
        [TestCollectionDateTime_UTC] [datetime] NULL,
        [TestCollectionDateTime_o] [smallint] NULL,
        [TestCollectionDate] [nvarchar](255) NULL,
        [TestCollectionTime] [nvarchar](255) NULL,
        [TestResultingDateTime] [datetime] NULL,
        [TestResultingDateTime_UTC] [datetime] NULL,
        [TestResultingDateTime_o] [smallint] NULL,
        [TestResultingDate] [nvarchar](255) NULL,
        [TestResultingTime] [nvarchar](255) NULL,
        [TestResultingUser] [nvarchar](255) NULL,
        [TestApprovalDateTime] [datetime] NULL,
        [TestApprovalDateTime_UTC] [datetime] NULL,
        [TestApprovalDateTime_o] [smallint] NULL,
        [TestApprovalDate] [nvarchar](255) NULL,
        [TestApprovalTime] [nvarchar](255) NULL,
        [TestApprovalUser] [nvarchar](255) NULL,
        [TestFinalDateTime] [datetime] NULL,
        [TestFinalDateTime_UTC] [datetime] NULL,
        [TestFinalDateTime_o] [smallint] NULL,
        [TestFinalDate] [nvarchar](255) NULL,
        [TestFinalTime] [nvarchar](255) NULL,
        [TestOrderDateTime] [datetime] NULL,
        [TestOrderDate] [nvarchar](255) NULL,
        [TestOrderTime] [nvarchar](255) NULL,
        [TestOrderDay] [nvarchar](255) NULL,
        [TestOrderMonth] [int] NULL,
        [TestOrderYear] [int] NULL,
        [TestLastReprintDate] [datetime] NULL,
        [TestLastReprintDate_UTC] [datetime] NULL,
        [TestLastReprintDate_o] [smallint] NULL,
        [TestLastRevisedDate] [datetime] NULL,
        [TestLastRevisedDate_UTC] [datetime] NULL,
        [TestLastRevisedDate_o] [smallint] NULL,
        [AnalyzerId] [nvarchar](255) NULL,
        [AnalyzerName] [nvarchar](255) NULL,
        [InterfaceMachine] [nvarchar](255) NULL,
        [InterfaceMachineNumber] [nvarchar](255) NULL,
        [InterfaceMachineType] [nvarchar](255) NULL,
        [DORBench] [nvarchar](255) NULL,
        [DORBenchDesc] [nvarchar](255) NULL,
        [TestPrice] [decimal](18, 2) NULL,
        [TestPricingError] [nvarchar](255) NULL,
        [TestSpecialPrice] [decimal](18, 2) NULL,
        [TestBasePrice] [decimal](18, 2) NULL,
        [TestIsBillingPanel] [nvarchar](255) NULL,
        [TestPanelCode] [nvarchar](255) NULL,
        [TestClientCode] [nvarchar](255) NULL,
        [TestBillingType] [nvarchar](255) NULL,
        [TestNumberOfServices] [decimal](18, 2) NULL,
        [TestCodeType] [nvarchar](255) NULL,
        [TestTargetFinalDate] [datetime] NULL,
        [TestMetTarget] [nvarchar](255) NULL,
        [TestTATCollectionToApproval] [int] NULL,
        [TestTATCollectionToFinal] [int] NULL,
        [TestTATEntryToFinal] [int] NULL,
        [TestTATEntryToApproval] [int] NULL,
        [TestOrderMonthYear] [nvarchar](10) NULL,
        [DoctorAccountNumber] [nvarchar](255) NULL,
        [DoctorAddressID] [nvarchar](255) NULL,
        [CompositeID] [nvarchar](100) NULL,
    CONSTRAINT [PK_Order_TestInformation] PRIMARY KEY CLUSTERED 
    (
        [EID] ASC,
        [SetID] ASC
    )

  • EasyBoy - Wednesday, August 15, 2018 7:40 AM

    Hello,
    I need to sync 2 tables from different server. Both table contains about 1055088770 rows.About 70% records are sync but i need to sync rest 30% rows (update, insert or delete based on source). I need to do in a loop with smaller chunk so it wont fill up log space. I may use SQL red gate or other third party tool but this will create very large script within one update and commit transaction and it will fill up log space.
    Any custom script will be great help.

    1.  Is it a bi-directional sync (master-master) or just one way (master-slave)?
    2.  How often does such a sync need to be done or is it one of "those" requirements where they need to be in-sync almost instantly and always?

    DO NOT USE THE REDGATE (DATA COMPARE) TOOL FOR THIS!!!!  The Redgate tool creates 1 INSERT/VALUES statement per row and will take a month of Sundays to execute for large transfers along with all that goes with such RBAR methods.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Sorry... didn't see the rest of the writeup.  According to what you wrote, "Server A" will be the master and "Server B" will be the slave.  That answers question 1 above.  What about question 2?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden - Wednesday, August 15, 2018 8:24 AM

    Sorry... didn't see the rest of the writeup.  According to what you wrote, "Server A" will be the master and "Server B" will be the slave.  That answers question 1 above.  What about question 2?

    Regarding to second question, this need to be done once only. Basically, these tables are in transaction replication and someone deleted subscription. so, we stopped whatever populates into these two tables at publisher, sync the tables and put it back in replication.

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply