Sudden slow performance from distributor to subscriber in transactional replication?

  • We've had transactional replication enabled for 6-8 months to update a secondary database instance that's used for reporting purposes. It's a single subscriber, set to pull.

    At some point over the last few days the subscriber apparently stopped being able to keep up with the replicated data, and now I'm sitting on anywhere from 2-3 million undistributed commands at any given point. Stuff is making it from the distributor to the subscriber, just very slowly apparently.

    The problem of course being that I can't find any reason for this sudden change.

    Nothing has changed from a network/SAN/server perspective that we can find, there's no errors in the SQL or SQL agent logs, and windows event logs are also quite boring. These servers are connected via multiple 10GBe links to a core switch, and our networking gear looks pretty bored, same deal on our SAN.

    If I spend some time sifting through the commands in the distributor db, nothing really crazy jumps out at me other than the sheer volume of commands sitting in there. If I look on the subscriber all I see are updates being applied without any blocking or anything unusual going on there. If I insert a trace token, latency from Publisher to Distributor is 1-2 seconds, while Distributor to Subscriber has still been pending after around 7-8 hours. I've tried restarting the various SQL agent jobs, nothing really changes there.

    Any suggestions on what to look at? I'm running out of ideas here.

  • Is there any chance that your subscriber databases are near the maximum file size that's been set, and due to the amount of data being replicated, the files are constantly "growing"?

    ______________________________________________________________________________Never argue with an idiot; Theyll drag you down to their level and beat you with experience

  • MyDoggieJessie (2/28/2013)


    Is there any chance that your subscriber databases are near the maximum file size that's been set, and due to the amount of data being replicated, the files are constantly "growing"?

    This didn't seem to be the case, plenty of headroom still on the subscriber with the current data file sizes.

    muthyala_51 (2/28/2013)


    http://www.mssqltips.com/sqlservertip/1823/troubleshooting-slow-sql-server-replication-issue-due-to-distributor-database-growth/

    I did run across this in my searching, but it didn't appear to be the issue.

    That said, I went away for a few hours late last night to clear my head and when I came back it had magically fixed itself. Still no idea what the root cause was though, which doesn't exactly give me warm fuzzies

  • Tables that involve a lot of "changing data" aren't always the best candidates for Transactional Replication. In the past we had a table for "Transactions Not Processed", or a TNP table that constantly inserted records, processed them, removed them, over and over (and over) again. This caused havoc on our replication set up. I ended up having to drop the article from the publication altogether...

    Do you know how to query the distribution DB to determine which articles are the heaviest hitters?

    ______________________________________________________________________________Never argue with an idiot; Theyll drag you down to their level and beat you with experience

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply