Log reader - Cannot execute sp_replcmds Timout expired

  • I've got a fairly new replication setup here, been working since thursady last week.

    I came in this morning and found the log reader agent stopped. I tried to restart it and got the following error

    Cannot execute sp_replcmds on the publisher. The reason was ODBC timeout expired.

    I edited the agent profile and changed the batch size to 50 (from the default 500) and upped the query timeout to 900. I got one batch through (50 transactions 77 commands) but the next batch again caused a timeout.

    I stopped the log reader and tried to run sp_replcmds in Query analyser also got a timeout.

    When I check the current activity pane in EM, I can see the log reader process waiting with a wait type of IO_COMPLETION.

    The only lock that the log reader has is a shared db lock.

    Has anyone else experienced this problem? Anyone got any suggestions?

    Thanks

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Be very careful about the log reader. You will be unable to issue sp_replcmds calls unless you issue an sp_replflush before. The reason is the old log reader agent still is holding on the the transaction log.

    You seem to have gotten around this though.

    The log reader is a high performance applet and normally can read mgs of the log without problems. However if you are not maintaining your log, or if you have open transactions it may have to much through gigs of the log before it can extract commands and mark the transactions as read so you can truncate it.

    You shoud run dbcc opentran to see if there are any open transactions which are causing the log to swell and degrading the log reader agents' performance. These transactions will show up as a

    Oldest active transaction:

        SPID (server process ID) : 53

        UID (user ID) : 1

        Name          : user_transaction

        LSN           : (39:498:1)

        Start time    : Sep 14 2004  8:08:13:950AM

     

    If you see this

            Oldest non-distributed LSN : (34535:345:345)

    That means there are still transactions in the log waiting to be extracted.

    One more point. Sometimes if your server is heavily pounded there are insuffifient resources available for the log reader to do its stuff and you will get error messages similar to what you are seeing. These errors can be cleared by reboots.

    DBCC execution completed. If DBCC printed error messages, contact your system administrator.

     

    --

    Hilary Cotter

    Looking for a SQL Server replication book?

    http://www.nwsu.com/0974973602.html

  • Thanks, I'll bear that in mind if this happens again.

    I had to remove the replication this morning, as I was no closer to a resolution than yesterday and the publisher logs were growing too large for comfort (about 12 hours away from running out of hdd space). I'll be replacing it this evening.

    The transaction logs are backed up every 30 minutes. The log reader started giving problems after a reboot of the publisher on sunday morning (08h30)

    The one thing that I suspect is a db maintenance plan on the publishing database which does a reorder of index and data pages starting at 00h30 sunday morning. This weekend it ran for 7 hours. I ran a test in my dev environment with a similar main plan on a published database and it had no effect on the replication (and doesn't appear to be a replicable transaction) I do wonder however if perhaps the long duration caused problems. I'm not sure if that's a online or offline operation.

    Since the problems started on a sunday, I doubt that the publishing server was under heavy load. It is also rebooted every morning at 07h00. The problems did not go away with a reboot.

    When I was playing in QA, I noticed that I could execute a sp_replcmds 4 and I got response in 0 seconds, however a sp_replcmds 5 was still executing 5 min later.

    Thanks for your help.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply