Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 

Replication Gotcha - Columns with Defaults

By Andy Warren,

If you're read my assorted articles over the past couple years you'll know that I've done a fair bit with transactional replication. I wouldn't call myself an expert with it, but I'm comfortable using it in production and dealing with the minor mishaps that occur. Still, there is always something waiting to surprise you! Thus, our story begins.

One of the great features of SQL2K over SQL7 is the ability to add/drop columns on tables that are part of a replication publication without having to drop all the subscribers. Pretty painless, you can do it through Enterprise Manager using the publication properties or using the sp_repladdcolumn/sp_repldropcolumn stored procs. I've used these a lot and they work well. If you're not familiar with them basically they add/remove the column on the table, update the publication so that the article includes the change, then replicates the change to all the subscribers. Note that you can do add/delete only, not actually change the definition of a column. This only works if you use either of the methods I mention above - if you just add the column directly to the table it will be there, but will not be part of the publication or be propagated to the subscribers. This let's you decide which behavior you prefer.

On this particular day I needed to remove a column that was part of a publication. I used EM (faster than looking up the rarely used proc syntax), executed the drop, got an error message. When I went to check the results, I wound up with the column gone from the article, gone from the publisher, still present on the subscriber, and the subscriber marked as failed! What the heck, right? I did a new snapshot and made a note to try to reproduce the behavior later. Not that painful, but it turned a 1 min chore into 15 minutes.

I finally had some time today to experiment, started by making a copy of Northwind and adding a column called 'TestColumn' with a default of 0 to the customers table.

Next I created a publication with the customer table as the only article, winding up with this:

The next step was to try to drop the column. Retracing my earlier footsteps I clicked on the 'Filter Columns' tab, selected the column, and clicked the Drop Selected Column button.

Here is the standard confirmation message - clicked Yes:

I then received this error message:

Which sort of makes sense. The EM table designer is smart enough to handle stuff like this for you, this is just doing a straight alter/drop. I then checked the publication expecting to see the column gone, but it was still there? Was still in the publisher table and in the subscriber table as well. Losing my mind perhaps? Just to make sure the world was still sane I removed the default from the column and redid the drop, it worked fine. So either I didn't accurately note what happened the first time, or it's not reproducible.

The only alternate scenario that I could think of was that the default must have been on the subscriber. I added the column back with a default as before, but this time I removed the default from the publisher and verified the default was present on the subscriber. Ran through the sequence again, no error message, appeared to complete successfully. Column gone from the article, removed from the publisher table. Refreshed replication monitor and see the sign of bad things. Checking the distribution agent revealed this:

The entire text of the last command was as follows:

if exists (select * from syscolumns where name='TestColumn' and id = object_id('Customers')) begin if exists (select * from sysobjects where name='syspublications') if exists (select * from sysarticles where objid=object_id('Customers')) and @@microsoftversion >= 0x07320000 exec sp_repldropcolumn @source_object=N'[Customers]',@column=N'TestColumn' else alter table [Customers] drop column [TestColumn] else alter table [Customers] drop column [TestColumn] end

At this point the subscription is broken. Restarting the distribution agent just returns the same error. Thinking that it might be possible to avoid a snapshot, I executed sp_browsereplcmds to see what was queued, revealing this:

This is just a readable version of info from the msrepl_commands table in the distribution database. Xact_seqno 0x00000054000000100038 looks to be the one that matches the error message, so the next step is to query the table directly. Wind up with this:

Did some experimenting, wound up deleting all rows that had a type = 1073741859. Don't try this in production! Ran the distribution agent, it ran without error. Time to verify. I insert a row on the publisher, get this error:

Which makes sense. The publisher doesn't have the column so it's not passing a value/placeholder for the TestColumn column, but the stored proc that replication uses on the subscriber still expects it. Here is what the subscriber proc looks like (autocreated by replication):

ALTER procedure [sp_MSins_Customers] @c1 nchar(5),@c2 nvarchar(40),@c3 nvarchar(30),@c4 nvarchar(30),@c5 nvarchar(60),@c6 nvarchar(15),@c7 nvarchar(15),@c8 nvarchar(10),@c9 nvarchar(15),@c10 nvarchar(24),@c11 nvarchar(24),@c12 int
AS
BEGIN

insert into [Customers]( [CustomerID], [CompanyName], [ContactName], [ContactTitle], [Address], [City], [Region], [PostalCode], [Country], [Phone], [Fax], [TestColumn] )
values ( @c1, @c2, @c3, @c4, @c5, @c6, @c7, @c8, @c9, @c10, @c11, @c12 )

I'm going to modify it manually to remove TestColumn and rerun the agent. Agent runs, row is present on the subscriber. In production you'd want to modify the update procedure as well. At this point you can drop the TestColumn from the subscriber.

I did a web search on msrepl_commands, found nothing to indicate what the type column means. As far as I can tell the value 1073741859 indicates a schema change, either add or drop. Anyone have more info?

I think what I found out is that at some point either the default existed on the publisher and subscriber and was deleted from the subscriber, or the default was just added to the subscriber. Executing the drop column failed because the default existed, which stopped the distribution agent. At that point the easiest (and safest) fix is to just execute a new snapshot. Alternatively you can try removing the rows from msrepl_commands and update the related stored procs manually. Doable, but has some risk, I'd say mainly that you might remove rows that aren't part of the problem and not realize it, or perhaps worse, affect another publication. If you were to goof you could fix it by doing a new snapshot. The short story is to make sure you've removed defaults from the column on both publisher and subscriber before you try to drop the column.

Total article views: 7439 | Views in the last 30 days: 4
 
Related Articles
FORUM

FK selection (an nvarchar(30) column or four int columns)

FK selection (an nvarchar(30) column or four int columns)

FORUM
FORUM

Identifying ASCII characters in NVARCHAR columns

Identifying ASCII characters in NVARCHAR columns

FORUM

Can I encrypt ntext and nvarchar(max) columns ?

Can I encrypt ntext and nvarchar(max) columns ?

SCRIPT

Conversion of default objects to column defaults

This script can convert the usage of bound default objects in tables to column defaults

Tags
replication    
sql server 7    
 
Contribute

Join the most active online SQL Server Community

SQL knowledge, delivered daily, free:

Email address:  

You make SSC a better place

As a member of SQLServerCentral, you get free access to loads of fresh content: thousands of articles and SQL scripts, a library of free eBooks, a weekly database news roundup, a great Q & A platform… And it’s our huge, buzzing community of SQL Server Professionals that makes it such a success.

Join us!

Steve Jones
Editor, SQLServerCentral.com

Already a member? Jump in:

Email address:   Password:   Remember me: Forgotten your password?
Steve Jones