How to Break Replication

  • Comments posted to this topic are about the item How to Break Replication

  • Nice article. I've spent a good chunk of the last year working on transactional replication (about 400k commands/day) from Oracle -> SQL 2005. It was already set up by the client, and it took many months to undo/redo (and learning all the way) a lot of the issues.

    Some additional thoughts:

    1> Optimize, optimize, optimize! If you think that the application of replicated commands are slow, use Profiler to see why they are.

    2> Make sure your codepages match. In my case, Oracle was set up handling all sorts of characters that SQL isn't. So the Log Reader would hurl, causing calamity.

    3> Don't try several fixes at once (a variant of the "ready, fire, aim" mentality so many ITfolks like). Remember that a lot of times, it takes a while for a "fix" to "take."

    4> Take a look at some of the isolation modes that SQL 2005 brings to the game. Snapshot Isolation mode can really cut down on some of the locks.

    5> Understand how SQL creates transactions out of replication. Then explain it to me . If the Log Reader stays offline for a bit, the commands accumulate on the Publisher side, and are essentially consumed and processed as single transactions. This can create a single transaction of huge size (that really isn't a transaction per se). This can really dog performance.

    Thanks for the article!

    Bob Coppedge


  • That's a nice article.

    Another common issue with replication is the famous error:

    The row was not found at the Subscriber when applying the replicated command.

    (Source: SQL-1 (Data source); Error number: 20598)

    I get a lot of this in a transactional replication and I haven't figure out yet what is causing it! There's no process that deletes data at the subscriber end so I am a bit confused at the moment...

    Have you experienced this problem before? How did you dealt with it?


  • Hi Poole,

    I have been working on Replication since 2002, and faced many problems with SQL 2000. With every new release of Service Pack some problems got solved. There are some problems still unattended, still I have to spend some time with the Conflict Viewer unnecessarily.

    With SQL 2005 all the problems got rectified. As a person suffering from SQL 2000, I loved the new version. I loved the way we can filter the data with joins.

    I have faced couple of problems with SQL 2005 as well

    Problem1: One table didn't had Primary key and had duplicate rows. I wanted to assign a Primary key for the table, so I deleted all the duplicate rows then created the Primary Key. I started the replication agent, to my surprise, agent failed to replicate saying could not create Index due to duplicate records. The agent was trying to create the Index before deleting the rows. The agent executes the DDL changes first then data. I had to delete the rows manually in the replication database and then restart the agent.

    Problem2: I have a Merge Replication setup which involves many tables(around 60) with filters, broken into 20 tables per replication, distributed to 5 different servers with unique filter for each server. When I insert a row in one of the key table which is used as join filter for most of filtered tables, it takes 12 hours to complete the job and consumes 27 GB of hard disk space for the log file. Its little bizarre why it needs some much of space and time. I have to make room for the Log file before executing the Insert statement. Luckily, I have 30 GB free space in my hard disk, otherwise what will happen I dont know.


  • Dave,

    A very good article! There is so little good information available on troubleshooting replication problems, so this is a valuable addition.

    Original author: 1-click install and best practice configuration of SQL Server 2019, 2017 2016, 2014, 2012, 2008 R2, 2008 and 2005.

    When I give food to the poor they call me a saint. When I ask why they are poor they call me a communist - Archbishop Hélder Câmara

  • David, rather than spelunk through ms_replcmds to find the offending row, there are two simpler alternatives. One is to add the error number as a '-skiperror' parameter to the agent, the other is to modify the proc on the subscriber to handle the condition. For example, the default delete proc raises an error if the rowcount <> 1. Just remove that code and the bad deletes get absorbed.

  • Some thoughts...

    1. Replication can be very troublesome, that's true, however, in general, I found the system reasonable in one way or another, so I do believe in determinism 🙂

    2. I didn't experience snapshot locking - on a table with a size of 13GB serving out hundreds/thousands of requests per minute.

    3. nosync init: yes, it's really weird that it doesn't create the stored procedures, BUT you can create them extremely easily:

    sp_scriptpublicationcustomprocs @publication = publication_name

    4. And my favorite scenario wasn't discussed: Replication chaining, that is, publishing something I subscribed to 🙂 - that makes your life a fun 🙂

  • Nice article.

    I want to comment on your removal of the guts of the delete stored procedures by telling you to look up "Deferred updates".

    If you update a column with a unique constraint on it, replication translates this into a delete and an insert - so be careful removing the guts of a delete procedure because they are sometimes used for updates as well.

  • Great article. Replication has been one of those "spooky" things that I have wanted to play with but setting up a test environment to learn in and then willing myself to break and fix it when I have a hundred other things on my plate means that I just haven't got round to it.

    The main bit that I'm unsure of, and I think you touched on it briefly without meaning to, is that we have what I'd hope is now a largely off-the-shelf product but with a database that I'm modifying (new tables, columns, etc) on a reasonably regular basis. I make change scripts to move the database from version to version but I haven't wanted to worry about having subscribers with data and making changes to articles, etc, etc... In SQL 2005, does it pick up table schema modifications and replicate them too? That'd be nice 😀

    I'll certainly add this article (and reference the thread here) to my briefcase for later reference. Thanks!

  • Good article. My experience is that replication works well, but remains tricky (particularly when your structure changes regularly). A test environment is a must, and scripting and documenting everything takes a lot of the hard work out of it.

  • Michael Earl (3/6/2008)

    Nice article.

    I want to comment on your removal of the guts of the delete stored procedures by telling you to look up "Deferred updates".

    If you update a column with a unique constraint on it, replication translates this into a delete and an insert

    Thanks for this tip Michael, you have almost certainly solved along running problem with this.

  • giorgio (3/6/2008)

    Another common issue with replication is the famous error:

    The row was not found at the Subscriber when applying the replicated command.

    (Source: SQL-1 (Data source); Error number: 20598)

    I've come across it when someone has not realised that the subscriber IS a subscriber and makes the change there instead of at the publisher end.

    I would be tempted to comment out the bit of the sp_MSdel_ stored procedure that throws an error when a record cannot be found but the problem is the delete may be part of a much bigger transaction and I want to trap problem transactions.

    This is also part of the reason why the -skiperror option on the agent profile isn't used. Its valid in some situations but in mine, knowing that replication has problems is important.

  • Hi David,

    Wonderful article and thanks for putting it, I had similar situations were I had issues with PK violations at the subscriber.

    I recently tried to setup transactional repl between 2005 and 2000 using few tables from AdventureWorks in 2005 but I keep getting this error when the sync happens:

    " Line 3: Incorrect syntax near '.'. "

    Any ideas about this,


    Razi, M.

  • If its Adventureworks that you are replicating to 2000 I would check to see if the SQL2005 schema's are the issue.

    Failing that, run the sp_MSins stored procedure at your subscriber with some known arguments to see if it still fails.

  • Good article.

    I have not experiencing any problems with disabling a delete command using:

    EXEC sp_changearticle @publication = 'pubname', @article = 'articlename', @property = 'del_cmd', @value = 'NONE'

    Has anyone else?

    Can someone please explain what an expired subscription is?

Viewing 15 posts - 1 through 15 (of 20 total)

You must be logged in to reply to this topic. Login to reply