Replication Error "field size too large"


Yesterday, I was fighting with an error I have never seen in the replication and not much info can be obtained via google either.

My replication environment is pretty much complex, with multiple pulishers to one single subscriber, and also one publisher to multiple subscribers. But they are all transactional replications (some with updatable subscriptions)

A few days ago, our databases got updated by the development team, new stored procedure, new tables, new columns etc were made to the existing dbs. After that, I set up the replication on the test servers, and soon I got the following errors in some of the publications:

Error messages:

The process could not bulk copy into table '"dbo"."Table_PUB"'. (Source: MSSQL_REPL, Error number: MSSQL_REPL20037)
Get help: http://help/MSSQL_REPL20037

Field size too large

To obtain an error file with details on the errors encountered when initializing the subscribing table, execute the bcp command that appears below. Consult the BOL for more information on the bcp utility and its supported options. (Source: MSSQLServer, Error number: 20253)
Get help: http://help/20253

bcp "DB_Report"."dbo"."Table_Pub" in "\\Server1\repldata\unc\Prod1_DB1_REPORTINGPUB5\20080602182491\Table_Pub_2.bcp" -e "errorfile" -t"\n<x$3>\n" -r"\n<,@g>\n" -m10000 -SProd2 -T -w (Source: MSSQLServer, Error number: 20253)
Get help:

When I tried to run the following bcp command as indicated in the error msg,

bcp "DB_Report"."dbo"."Table_Pub" in "\\Server1\repldata\unc\Prod1_DB1_REPORTINGPUB5\20080602182491\Table_Pub_2.bcp" -e "errorfile" -t"\n<x$3>\n" -r"\n<,@g>\n" -m10000 -SProd2 -T -w

I did not go very far as I got the error msg, like

"unexpected EOF encountered in bcp data file

0 rows copied "

and if I check the errorfile, it has nothing as it is 0-byte size large.

After numerous different attempts, I finally figured out why.

In this case, we have two publishers (A and B) and one subscriber (S). The two publishers publish a same table, let's call it TABLE_PUB, but TABLE_PUB on A and B have different data sets, From A to S, once the snapshot starts, the replication will "drop existing one and create a new one" if TABLE_PUB already exists on S. For B to S, once the snapshot starts, the replication will "keep existing object unchanged" if TABLE_PUB already exists on S.

Recently our developer team add some new columns to TABLE_PUB on both A and B, however, the newly added column sequence for TABLE_PUB on A and B is different. For example, three new columns (C1, C2, C3) are added, but on A, the columns are added in sequence of C1, C2, C3 while on B, the column sequence is C3, C2, C1.

So during the snapshot initiated from B to S, when BCP tries to load \\Server1\repldata\unc\Prod1_DB1_REPORTINGPUB5\20080602182491\Table_Pub_2.bcp into Table_PUB on S, because Table_Pub_2.bcp is created based on Table_Pub on B, thus each column has its specific position, which cannot match the column positions of Table_Pub on S, (Table_Pub on S has same column sequence as that on A after snapshot initiated from A), this caused the error.

Hope this can be of help to you in future.

New findings:

After I posted this blog, two weeks later, I encountered this error again on another publication article, let's call it Table_Pub2. However, after compairng the Table_Pub2 on both publishers A and B (and also on the subscriber S), it seems the table schema is the same, in terms of column sequence, colum data types and sizes, across the servers, But the error persists even after I have run snapshots / subscription reinit multiple times, I even drop the publication / subscription and reset it up, the error simply refused to go. At the very end, I dropped the Table_Pub2 on publisher B, and then scripted out Table_Pub2 from A, and then recreate the Table_Pub2 on B using the scripts from A, and finally did a resnapshot on both A and B for this publication, and then everything works fine ever since.

So the lesson learned here is:  

If you see this error and you cannot find the obvious column difference, try to manually recreate the table on one publisher with the script that creates the published table from another publisher.