MSREPL_COMMANDS not cleaning out

  • I've got a SQL 2008R2 Enterprise server, with SP2 with some possible replication issues. I started by reading the post in the 2005 section that had the same issue and trying to diagnoise from that: http://www.sqlservercentral.com/Forums/Topic1285555-291-1.aspx

    ... I have Immediate_Sync set to 0, my min threshold is 0, my max is 60 (was 96, I've been lowering it). My clean up job runs every 10 minutes and always finds 0 records to delete. I've been watching my MSRepl_Commands table grow by about 10 million records a day.

    I'm not sure what else to look at, at this point. I looked at MSdistribution_status and I can see UndelivCmdsInDistDB drop down to 0 records when replication runs and the DelivCmdsInDistDB grows. I've tried manually running EXEC dbo.sp_MSdistribution_cleanup @min_distretention = 0, @max_distretention = 60 and it comes back with "Removed 0 replicated transactions consisting of 0 statements in 0 seconds"

    I searched my agent job messages for the results of the clean up job, it has records going back to Oct. 2nd and none of the records indicate any records being deleted.

    SELECT * FROM dbo.sysjobhistory WHERE job_id = <cleanup job id> AND step_id = 1 AND message NOT LIKE '%Removed 0%'

    If I remove the Message NOT LIKE condition and just view all results, I can verify that all of them show "Removed 0 replicated transactions consisting of 0 statements"

    I tried running the cleanup command with a lower max threshold but it also, didn't delete any records.

    Digging into this more, I see that the exec dbo.sp_MSmaximum_cleanup_seqno call isn't returning a @max_cleanup_xact_seqno record.

    This means the query

    select top 1 @max_cleanup_xact_seqno = xact_seqno

    from MSrepl_transactions with (nolock)

    where

    publisher_database_id = @publisher_database_id and

    (xact_seqno < @min_xact_seqno

    or @min_xact_seqno IS NULL) and

    entry_time <= @min_cutoff_time

    order by xact_seqno desc

    returns nothing. I debugged it and it's doing a loop that populates the @min_xact_seqno variable with various values. Most would return a @max_cleanup_xact_seqno, but the last value placed in @min_xact_seqno isn't found at all in the msrepl_transactions table. So I'm wondering if this indicates an issue? Anyone know? I'm not familiar enough to feel confident with deleting things to make them go..but it looks like something is messed up, preventing the cleanup from getting a record to start cleaning on.

    Thanks.

  • Let me start with a simpler question, should the cleanup job be removing records from the MSrepl_commands table, or should that table just grow?

  • I want to put a final note on here in case someone comes across this in the future. I researched the min_xact_seqno number that wasn't returning any rows in my above query. This lead me to the MSrepl_commands table, which based on agent_id lead me to the MSdistribution_agents table. From here I figured out which job was causing the issues. I was able to see that our developers had decided they only needed to replicate from a set of tables once a month, because they only loaded data once a month. The Replication had expired on for the job because it hadn't been run for so long, so I scripted out and dropped it. When I re-ran the clean up job, it deleted over 175 Million records from the MSrepl_commands table.

  • Eric - This may not be related to your specific issue, but I was having a problem where MSrepl_commands wasn't dropping below a certain size and when the distribution cleanup job would run it would scan the entire MSrepl_commands table and not delete records.

    It turns out the issue was because two of my publications were set to "immediate sync". This had the effect of holding onto those commands until the retention rate, which for me was 48 hours. This sort of thing wasn't necessary as we don't have any anonymous subscriptions. Once I changed this setting and manually ran the cleanup job the MSrepl_commands table dropped down to acceptable levels. Here's the basic steps to accomplish this:

    USE Your_pub_DB

    GO

    EXEC sp_changepublication

    @publication = 'Your_pub',

    @property = 'allow_anonymous',

    @value = 'false'

    GO

    EXEC sp_changepublication

    @publication = 'Your_pub',

    @property = 'immediate_sync',

    @value = 'false'

    GO

    USE distribution

    GO

    EXEC dbo.sp_MSdistribution_cleanup @min_distretention = 0, @max_distretention = 48

    -Grant Roles

  • Hi Grant, thank you for the reply I appreciate it. I really did try my best to thoroughly document my experience. I said in my first post that I had set in immediate_Sync to no. the issue for me was caused by one of my replications not being a continuous run but being called on demand and having not been called for a month.

    Eric

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply