Queue Reader Agent error

  • Hi,

    I've been searching a long time now to find out how to get rid of this error.

    The replication agent has not logged a progress message in 10 minutes. This might indicate an unresponsive agent or high system activity. Verify that records are being replicated to the destination and that connections to the Subscriber, Publisher, and Distributor are still active.

    This issue has been raised a few times here and lots of times else where, but no solutions.

    http://www.sqlservercentral.com/Forums/Topic445184-291-1.aspx

    http://www.sqlservercentral.com/Forums/Topic1168869-391-1.aspx

    http://www.sqlservercentral.com/Forums/Topic1173906-291-1.aspx

    http://www.sqlservercentral.com/Forums/Topic1038860-1550-1.aspx

    Replication seems to be working fine. The big red circle with a cross in replication monitor makes us nervous though.

    The publisher that shows the cross has 8 publications with a total of 9 subscriptions (1 each except for one with 2 subscriptions).

    Windows 2003 (Clustered) SQL 2005 ==>> 2000 / 2005 / 2008.

    I have been comparing dbo.syspublications, dbo.syssubscriptions, dbo.sysarticles to those on a similar box with similar replications. I can't find any differences in values that would point a possible configuration difference.

    I have tried all sorts to try to resolve this.

    USE master

    exec sp_changedistributor_property

    @property = N'heartbeat_interval',

    @value = 5;

    Bigger or smaller interval only changes the number in the error.

    dibbydibby posted a query in one of the above links to see if the distribtor is overloaded.

    -- Commands per article

    selectag.name Publication, s.undelivcmdsindistdb, s.delivcmdsindistdb

    frommsdistribution_status s, msdistribution_agents ag, msarticles ar

    wheres.agent_id = ag.id and s.article_id = ar.article_id

    orderby undelivcmdsindistdb desc

    -- Commands per Publication

    selectag.name Publication, COUNT(*) Articles, SUM(s.undelivcmdsindistdb) undelivcmdsindistdb, SUM(s.delivcmdsindistdb) delivcmdsindistdb

    frommsdistribution_status s, msdistribution_agents ag, msarticles ar

    wheres.agent_id = ag.id and s.article_id = ar.article_id

    GROUPBY ag.name

    orderby undelivcmdsindistdb desc

    From this I gleamed that retrieving all 84 rows from msdistribution_status takes 15-30 secs. This is due to msrepl_commands having 7mils records. Is this a normal size?

    The troubled publisher only ocassionally has 20+ under column undelivcmdsindistdb. If anything, it looks like another server is having problems with very high undelivcmdsindistdb values, but it's not.

    I have also cleared out [Distribution].[dbo].msqreader_history then manually started job "Replication agents checkup". All agent_IDs are writing to this table.

    Somewhere someone posted this query. Though, not what you should be looking for.

    select 'dbcc inputbuffer(' + CAST(spid AS Varchar(5)) + ')' from sys.sysprocesses where program_name like '<Publisher>%'

    dbcc inputbuffer(73)

    Another suggestion was to run

    EXEC sp_browsereplcmds

    Which returns:

    Msg 6513, Level 16, State 27, Procedure sp_browsereplcmds, Line 110

    Failed to initialize the Common Language Runtime (CLR) v2.0.50727 due to memory pressure. Please restart SQL server in Address Windowing Extensions (AWE) mode to use CLR integration features.

    I suppose this may or may not be the cause. Got 16 cores / 16 GB RAM on this clustedered box. 99.98% buffer hitting. No other reason to believe memory is an issue.

    This just shows the unioned result msqreader_history tables between the troublesome publisher and the comparable publisher. Svr1 being the naughty one.

    SvrNameagent_idpublication_idrunstatusstart_timetimedurationcommentstransaction_idtransaction_statustransactions_processedcommands_processeddelivery_ratetransaction_ratesubscribersubscriberdberror_idtimestamp

    Svr11NULL12013-03-26 13:29:59.2972013-03-26 13:29:59.2970Starting agent.NULL00000NULLNULL00x0000000005AFB7AE

    Svr11NULL62013-03-26 13:29:59.2972013-03-26 13:35:00.453301The replication agent has not logged a progress message in 5 minutes. This might indicate an unresponsive agent or high system activity. Verify that records are being replicated to the destination and that connections to the Subscriber, Publisher, and Distributor are still active.NULL00000NULLNULL00x0000000005AFBA7D

    Svr21NULL42013-03-10 18:12:06.1602013-03-26 17:16:44.5301379078No queued transaction availableNULL00000NULLNULL00x0000000005A157F1

    Could use some help on this one.

    Cheers.

    P.S. Sorry for the long post.



    For better, quicker answers on T-SQL questions, read Jeff Moden's suggestions.[/url]

    "Million-to-one chances crop up nine times out of ten." ― Terry Pratchett, Mort

Viewing 0 posts

You must be logged in to reply to this topic. Login to reply