If you've set up replication without immediately
updating (or even queued updating in SQL2K) subscribers, you may see this error
if you allow write access to tables on the subscriber. If a user inserts a
record into the subscriber, then later that same key is generated on the
publisher, this error will result. Not granting insert permissions on the
subscriber is a great way to stop this ---- if your situation allows!
When it happens, the distribution agent will
shutdown. The usual way to fix this is to reinitialize the subscription and push
a new snapshot to the subscriber. If you're using SQL 7, SQL takes share locks
on tables during generation of the snapshot which will block your users. You
also have to look at how long it will take to get the new snapshot to the
subscriber (db size and bandwidth).
However, there are alternatives. One is to add
the -skiperrors parameter to the distribution agent and restart it. Not a great
solution, but it will work. You can either leave it in place or remove it once
you've made it past the "bad" record. Depending on how the error
occurred, this may leave you with incorrect data on the subscriber. You can then
join to the subscriber via linked server to update the record to match the one on
the publisher. Or just use this as an interim fix and do the snapshot after
hours.
Another alternative is to remove the
undistributed record from the distribution database. Basically you need to
identify the transaction and delete it from the MSRepl_Command table. One way to
find is to use sp_browsereplcmds - keep in mind that the publisher_database_id
is NOT the database ID - you need to look it up in MSPublishers_Databases to get
the actual database name. You're looking for the earliest insert transaction for
the database in question. On my test machine this returned:
So in this case the transaction we need to remove
is xact_seqno 0x000000110000014E0004. If you'll take a look at the error detail
in the first image above, you'll notice that the last command matches the
command in the output from sp_browsereplcmds, confirming our research. Now you
just need to execute:
use distribution go delete from msrepl_commands where xact_seqno=0x000000110000014E0004 go |
Restart the distribution agent. Your subscriber
still doesn't necessarily have the same data for that one row as the publisher,
so you can update the subscriber or redo the snapshot as discussed earlier.
The last alternative is figure out the duplicate
key and remove it from the subscriber. Looking at the text of the last command
from the error detail, you can see the values that are being passed to the
sp_MSins_Books stored procedure on the subscriber. A quick look at that sp on
the subscriber confirms that the first parameter is the primary key, so all we
need to do is connect to the subscriber and delete the record:
use Test1 go Delete from books where BookID=9 go |
Then restart the distribution agent. This method
keeps your subscriber in sync without having to make any other changes. And of
the three, I believe it represents the least risk of error.
For more
information about the -skiperror parameter see BOL or MSDN.