SQL Transaction Replication on a large DB

  • I want to create a transaction replication (SVR-A to SVR-B) from scratch on one of our DB. It is approx 280GB with 6ooo+ tables. I was simulating it on a test environment and publication of the articles takes awhile.

    [1] Is there a faster way to do this?

    [2] Before I create a subscription, the snapshot agent status should show 100% (all articles published) - Yes?

     

    TIA

     

     

     

     

  • If it's generating and applying the snapshot that is taking awhile, another option is to initialize replication from a backup:

    Initialize a Transactional Subscription Without a Snapshot

    Not sure what you are looking at with the agent status - maybe replication monitor, viewing the snapshot agent status. After the snapshot is generated, the status is Complete and the last action would say "[100%]. A snapshot of xx articles was generated"

    Sue

  • Thanks for the link!

    [2] Before I create a subscription, the snapshot agent status should show 100% (all articles published) - Yes?

    OK I guess I have to wait for a 100% snapshot before I create a subscription.

    The Snapshot Agent must run and generate a snapshot of the publication before the subscriptions can be initialized.

     

     

     

    Attachments:
    You must be logged in to view attached files.
  • When you select to initialize the subscription immediately then yes you need to wait for the snapshot to finish generating all the articles.

    Sue

  • Got it - thanks!

    Follow-up, when I created the publisher, the snapshot was pointing to the network path I specified [\\SERVER\SNAPSHOT].

    I see it created files and folder/subfolder as well.

    \\SERVER\SNAPSHOT\unc\SERVER_TESTDB_TESTDB_PUBLISHER\20200122103014\(28524 items created)

    I created the subscription and I got all the articles (all 6000+ tables) in the replicated server and replication was running as designed.

    Question is:

    [1] Do I still need the files in the SNAPSHOT folder (all 28524 items)?

    [2] If yes, when does this folder gets cleared out? I assumed that once the subscription reads the files and adds them into the replicated server, it deletes them.

    TIA

     

     

     

  • You can reinitialize using the existing snapshot when the files are still there. Otherwise, the distribution cleanup job deletes the files. The default is to clean up files older than 72 hours. You can check the job and by default executes the following:

    EXEC dbo.sp_MSdistribution_cleanup @min_distretention = 0, @max_distretention = 72

    Sue

  • I changed it to 1 hour and I see the files were removed. 🙂

    Thanks again!

     

  • So I was watching a youtube tutorial initializing from a backup.

    https://www.youtube.com/watch?v=2KYlIK4uW5Y&t=280s

    I was wondering, why didn't he create a SNAPSHOT AGENT (remarked out)  at all even if he was using initialization from backup. Wouldn't there be any use of a snapshot agent once you have replication set up and running?

     

    • This reply was modified 4 years, 3 months ago by  ARPRINCE.
  • It's not necessarily needed in that scenario. In transactional replication the snapshot agent is used for intializing the schema, data of the publication by creating the snapshot files. Those files are generated by the snapshot agent. If you are initializing from a backup, you don't need to generate those files. After that, it's the log reader agent and distribution agent that do all the work of reading the changes that need to be replicated and propagating those changes to the subscriber.

    Sue

  • Sue_H wrote:

    If you are initializing from a backup, you don't need to generate those files. After that, it's the log reader agent and distribution agent that do all the work of reading the changes that need to be replicated....

    The DB that is being published, when I initialize from backup, does the data there need to be static?

     

     

  • That would make it simpler wouldn't it? But no it doesn't have to be static but the transactions since the last full backup would need to be in the distribution database so that you don't miss any data. If not, you can often generate a differential or log backups and apply them to the subscriber. That's that part of using a backup that can make it more complex. Some of that is explained in this article and also has some queries you can do to check the LSNs:

    Deep Dive on Initialize from Backup for Transactional Replication

    Sue

     

  • How is non-static data handled if you initialize via Snapshot?

    For example, TABLE_A (10 records) gets a snapshot and is now on the snapshot folder.  TABLE_A gets updated, now has 12 records. What happens next?

    SUBSCRIBER reads the snapshot (10 records), then it gets it from the logs (2 records) after reading 100% snapshot?

     

     

  • In any case (backup or snapshot) it's the same type of process which is really the same process for transaction replication in general. The log reader picks up the transaction and the commands would be in the distribution database until the subscriber is ready for those to be applied. That's why when initializing from a backup, its often recommended to disable the distribution cleanup or to increase the retention time for the distribution cleanup as doing that would leave the commands in the distribution database longer so that they are available for the subscription.

    Sue

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply