Blog Post

Replication Gotcha - Including An Article In Multiple Publications

,

When administering replication topologies it's common to group articles into publications based on roles that subscribers fulfill. Often you'll have multiple subscriber roles and therefore multiple publications, and in some cases a subset of articles are common between them. There's nothing to prevent you from adding the same article to more than one publication but I wanted to point out how this can potentially lead to major performance problems with replication.

Let's start with a sample table:

CREATE TABLE [dbo].[ReplDemo]
   
(
     
[ReplDemoID] [int] IDENTITY(1, 1) NOT FOR REPLICATION
                        
NOT NULL ,
     
[SomeValue] [varchar](20) NOT NULL ,
     
CONSTRAINT [PK_ReplDemo] PRIMARY KEY CLUSTERED ( [ReplDemoID] ASC )
       
ON [PRIMARY]
   
)
ON  [PRIMARY]
GO

Now let's pretend that we need this table replicated to two subscribers which have different roles. We'll create one publication for each role and add the table to both publications:

-- Adding the transactional publication
EXEC sp_addpublication @publication = N'ReplDemo Publication A',
   
@description = N'Publication to demonstrate behavior when same article is in multiple publications',
   
@sync_method = N'concurrent', @retention = 0, @allow_push = N'true',
   
@allow_pull = N'true', @allow_anonymous = N'false',
   
@enabled_for_internet = N'false', @snapshot_in_defaultfolder = N'true',
   
@compress_snapshot = N'false', @ftp_port = 21, @ftp_login = N'anonymous',
   
@allow_subscription_copy = N'false', @add_to_active_directory = N'false',
   
@repl_freq = N'continuous', @status = N'active',
   
@independent_agent = N'true', @immediate_sync = N'false',
   
@allow_sync_tran = N'false', @autogen_sync_procs = N'false',
   
@allow_queued_tran = N'false', @allow_dts = N'false', @replicate_ddl = 1,
   
@allow_initialize_from_backup = N'false', @enabled_for_p2p = N'false',
   
@enabled_for_het_sub = N'false'
GO
EXEC sp_addpublication_snapshot @publication = N'ReplDemo Publication A',
   
@frequency_type = 1, @frequency_interval = 0,
   
@frequency_relative_interval = 0, @frequency_recurrence_factor = 0,
   
@frequency_subday = 0, @frequency_subday_interval = 0,
   
@active_start_time_of_day = 0, @active_end_time_of_day = 235959,
   
@active_start_date = 0, @active_end_date = 0, @job_login = NULL,
   
@job_password = NULL, @publisher_security_mode = 1
GO
-- Adding the transactional articles
EXEC sp_addarticle @publication = N'ReplDemo Publication A',
   
@article = N'ReplDemo', @source_owner = N'dbo',
   
@source_object = N'ReplDemo', @type = N'logbased', @description = N'',
   
@creation_script = N'', @pre_creation_cmd = N'drop',
   
@schema_option = 0x00000000080350DF,
   
@identityrangemanagementoption = N'manual',
   
@destination_table = N'ReplDemo', @destination_owner = N'dbo', @status = 8,
   
@vertical_partition = N'false',
   
@ins_cmd = N'CALL [dbo].[sp_MSins_dboReplDemo]',
   
@del_cmd = N'CALL [dbo].[sp_MSdel_dboReplDemo]',
   
@upd_cmd = N'SCALL [dbo].[sp_MSupd_dboReplDemo]'
GO

-- Adding the transactional publication
EXEC sp_addpublication @publication = N'ReplDemo Publication B',
   
@description = N'Publication to demonstrate behavior when same article is in multiple publications',
   
@sync_method = N'concurrent', @retention = 0, @allow_push = N'true',
   
@allow_pull = N'true', @allow_anonymous = N'false',
   
@enabled_for_internet = N'false', @snapshot_in_defaultfolder = N'true',
   
@compress_snapshot = N'false', @ftp_port = 21, @ftp_login = N'anonymous',
   
@allow_subscription_copy = N'false', @add_to_active_directory = N'false',
   
@repl_freq = N'continuous', @status = N'active',
   
@independent_agent = N'true', @immediate_sync = N'false',
   
@allow_sync_tran = N'false', @autogen_sync_procs = N'false',
   
@allow_queued_tran = N'false', @allow_dts = N'false', @replicate_ddl = 1,
   
@allow_initialize_from_backup = N'false', @enabled_for_p2p = N'false',
   
@enabled_for_het_sub = N'false'
GO
EXEC sp_addpublication_snapshot @publication = N'ReplDemo Publication B',
   
@frequency_type = 1, @frequency_interval = 0,
   
@frequency_relative_interval = 0, @frequency_recurrence_factor = 0,
   
@frequency_subday = 0, @frequency_subday_interval = 0,
   
@active_start_time_of_day = 0, @active_end_time_of_day = 235959,
   
@active_start_date = 0, @active_end_date = 0, @job_login = NULL,
   
@job_password = NULL, @publisher_security_mode = 1
GO
-- Adding the transactional articles
EXEC sp_addarticle @publication = N'ReplDemo Publication B',
   
@article = N'ReplDemo', @source_owner = N'dbo',
   
@source_object = N'ReplDemo', @type = N'logbased', @description = N'',
   
@creation_script = N'', @pre_creation_cmd = N'drop',
   
@schema_option = 0x00000000080350DF,
   
@identityrangemanagementoption = N'manual',
   
@destination_table = N'ReplDemo', @destination_owner = N'dbo', @status = 8,
   
@vertical_partition = N'false',
   
@ins_cmd = N'CALL [dbo].[sp_MSins_dboReplDemo]',
   
@del_cmd = N'CALL [dbo].[sp_MSdel_dboReplDemo]',
   
@upd_cmd = N'SCALL [dbo].[sp_MSupd_dboReplDemo]'
GO

After creating the publications we create our subscriptions, take & apply the snapshot, and we're ready to start making changes so we execute this simple insert statement:

INSERT  INTO dbo.ReplDemo
       
( SomeValue )
VALUES  ( 'Test' )

Here's the million dollar question: How many times does this insert statement get added to the distribution database? To find out we'll run the following statement on the distributor (after the log reader agent has done it's work, of course):

SELECT  MSrepl_commands.xact_seqno ,
       
MSrepl_commands.article_id ,
       
MSrepl_commands.command_id ,
       
MSsubscriptions.subscriber_id
FROM    distribution.dbo.MSrepl_commands AS [MSrepl_commands]
       
INNER JOIN distribution.dbo.MSsubscriptions AS [MSsubscriptions] ON MSrepl_commands.publisher_database_id = MSsubscriptions.publisher_database_id
                                                             
AND MSrepl_commands.article_id = MSsubscriptions.article_id
       
INNER JOIN distribution.dbo.MSarticles AS [MSarticles] ON MSsubscriptions.publisher_id = MSarticles.publisher_id
                                                             
AND MSsubscriptions.publication_id = MSarticles.publication_id
                                                             
AND MSsubscriptions.article_id = MSarticles.article_id
WHERE   MSarticles.article = 'ReplDemo'
ORDER BY MSrepl_commands.xact_seqno ,
       
MSrepl_commands.article_id ,
       
MSrepl_commands.command_id

Here's the output of the statement:

Query Results

That's one row for each publication the table article is included in. Now imagine that an update statement affects 100,000 rows in the table. In this example that would turn into 200,000 rows that will be inserted into the distribution database and need to be cleaned up at a later date. It's not hard to see how this could lead to performance problems for tables that see a high volume of insert\update\delete activity.

Workarounds
Two workarounds for this behavior come to mind:

  1. Modify data using stored procedures, then replicate both their schema and execution. This won't help for insert statements and is useless if you're only updating\deleting a single row each time the procedure executes. This also assumes that all dependencies necessary for the stored procedure(s) to execute exist at the subscriber
  2. Limit table articles to one publication per article. If you're creating publications from scratch then place table articles that would otherwise be included in multiple publications into their own distinct publication. If you're working with existing publications that already include the table article then subscribe only to the article(s) that you need rather than adding the article to another publication. (Subscribing to individual articles within a publication can get tricky - I'll demonstrate how to do this in a future post)

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating