Scaling Out the Distribution Database

  • Comments posted to this topic are about the item Scaling Out the Distribution Database

  • Excellent article, David.

  • Great article David!

    I particularly enjoyed reading about Adding a New Distributor to an existing Replication Topology.

    I would be interested to know your thoughts on implementing this but for a high availability scenario.

    So for example there could be One Publisher (a Read/Write server for an application) and say 8 Subscribers (Read Only Databases for an application) i.e. a typical search pool scenario. Let’s say the Distributor could not be taken down for more than a few seconds as the Publisher and Subscribers must remain within a reasonable sync time i.e. seconds.

    In such a scenario I am thinking that a solution could be to create an identical Publication, that points to a new Distributor and to move the Search Pool servers in and out of rotation one at a time. In other words, drop one subscription at a time and re-create using the new Publication (that references the new Distributor). This way the application remains operational whilst the migration is being implemented.

    My major concern would of course be ensuring the consistency of the data throughout the migration process.

    I would be interested to know your thoughts and whether you have worked on a scenario such as this.

  • David,

    Can you elaborate on the following statement you made:

    the number of records in the MSRepl_transactions table is likely to be very high and once it has gone beyond a certain size

    and provide guidance on what that "certain size" is and how someone knows if they've exceeded it?

    Also, do you have any suggestions for other steps that someone should take to optimize their existing distribution database before deciding that it's time to create a new one?

    Kendal Van Dyke

    @sqldba

    Kendal Van Dyke
    http://kendalvandyke.blogspot.com/

  • It is very hard to say what "a certain size" is because it depends on the hardware and configuration of the machine that is acting as a distributor and also the amount of data running through the distributor.

    We have some databases that replicate quite narrow tables and these are good for tens, if not hundreds of millions of records. Others have wide tables and the distributors struggle if the number of replicated transactions goes into the low millions.

  • I probably asked the question poorly, so let me take another approach. Let's say I've been tasked with managing a replication environment and I notice that things are getting slower and slower. I've read your article and start to wonder if I should add a second distribution database, but doing so requires setting up a maintenance window and a significant amount of work on my part. What kinds of things should I look at to get a better feel for whether adding a second distribution database is really the right course of action?

    Kendal Van Dyke
    http://kendalvandyke.blogspot.com/

  • Good article!

    I see that the Distributor Properties for the Transaction retention value is 72 hours, and the History retention value is 48 hours. These are default values. I'm curious if anyone changes these values, and if so why? Also, what are the consequences if these values are changed?

  • Great Article!

    I have all my published databases on one SQL Instance. I have a stand-alone Distributor server which gets beat up every so often. It would be nice if it were possible to configure one distributor database per published database. Or even one distributor database per publication, if one really wanted to do that. Hopefully, there is a good reason for having only one distributor database per published instance! But, I'm hoping that changes! 🙂

    Brannon Weigel

  • Good article. You did a good job of describing how to add distribution databases, which is something that I had never considered before.

    However, I am curious about the factors behind your recommendation. I would expect a bottleneck to occur on the CPU or I/O, but not the database itself. You referenced the MSRepl_transactions table getting too large, but I don't see that as being a problem because the clustered index is on (publisher_database_id, xact_seqno) which is int and varbinary(16). Would the concern be seeks because the index level would be deeper with more records? What advantage would there be assuming one disk array and I/O is not the bottleneck?
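
    To put rough numbers on the index-depth question, here is a back-of-the-envelope sketch in Python. The rows-per-leaf-page and keys-per-index-page figures are assumed round values chosen for illustration, not measurements of MSRepl_transactions, and the function name is hypothetical.

```python
def btree_levels(row_count, rows_per_leaf_page, keys_per_index_page):
    """Estimate the number of levels in a B-tree index, leaf level included."""
    # Ceiling division: how many leaf pages are needed to hold every row.
    pages = -(-row_count // rows_per_leaf_page)
    levels = 1
    # Each pass adds one level of index pages above the current level,
    # until a single root page remains.
    while pages > 1:
        pages = -(-pages // keys_per_index_page)
        levels += 1
    return levels


# Assumed capacities (illustrative only, not taken from a real distributor).
ROWS_PER_LEAF = 100
KEYS_PER_PAGE = 200

for rows in (1_000_000, 10_000_000, 100_000_000):
    print(rows, btree_levels(rows, ROWS_PER_LEAF, KEYS_PER_PAGE))
```

    Under these assumptions, growing from one million to one hundred million rows adds a single level to the index, so seek depth alone seems unlikely to explain a large slowdown.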

    I think you did great in the how to, but I am wondering about the why.

    Regards,

    Toby

  • On one particular publication we changed the retention period to 3 hours because if something goes wrong with the subscriber then the transactions start building up in the distributor and if that comes under stress the publisher log files start to bloat until disk space on the production box is threatened.

    It's worth mentioning that we do have to DBCC DBREINDEX or ALTER INDEX REBUILD regularly.

    The distribution databases are counted as system databases but they behave like user databases.

  • RML51, the transaction retention period is used by the distribution cleanup agent to mark subscriptions as inactive if they are further behind than the retention period and to remove transactions that are older than the specified period. How long commands and transactions are retained after they've been delivered to all subscribers depends on the immediate_sync setting for the publication. If TRUE, transactions will be retained until they are older than the retention period; if FALSE they will be removed by the distribution cleanup agent the next time it runs after the transaction has been delivered to all subscribers.
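
    The retention rule described above can be sketched as a small decision function. This is a simplified model of the behaviour as stated in this post, written in Python for illustration; it is not the cleanup agent's actual implementation, and the function and parameter names are hypothetical. The 72-hour default is the transaction retention value quoted earlier in the thread.

```python
from datetime import datetime, timedelta


def is_removable(entry_time, now, delivered_to_all, immediate_sync,
                 retention_hours=72):
    """Return True if this simplified cleanup model would remove the transaction."""
    age = now - entry_time
    if age > timedelta(hours=retention_hours):
        # Older than the retention period: removed regardless of delivery state.
        return True
    if immediate_sync:
        # immediate_sync is TRUE: kept until the retention period has passed,
        # even if every subscriber already has the transaction.
        return False
    # immediate_sync is FALSE: removable once all subscribers have received it.
    return delivered_to_all


now = datetime(2010, 3, 30, 12, 0)
one_hour_old = now - timedelta(hours=1)

# Same delivered transaction, different immediate_sync settings.
print(is_removable(one_hour_old, now, True, immediate_sync=True))
print(is_removable(one_hour_old, now, True, immediate_sync=False))
```

    In this model the only difference between the two calls is the immediate_sync flag: with TRUE the delivered transaction stays in the distribution database until it ages out, while with FALSE it is eligible for removal on the next cleanup run.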

    Toby, I agree with you - David did a great job explaining the how but not the why. The way I see it, you're doing the same amount of IO whether you've got one distribution DB or ten. By setting up multiple distribution DBs you may even be adding IO overhead by making your disk heads jump around from one DB to another to read/write transactions and commands. Maybe I can see the logic in creating multiple distribution DBs if you are putting them on separate physical drives...but if you've got those drives available I'd just keep one distribution DB and spread it across as many spindles as you can to begin with.

    When it comes to replication the idea is to keep the distribution database lean and mean. If you've got distribution DB growth issues I'd look at identifying the root cause first. Make sure immediate_sync is set to FALSE and that your distribution cleanup agent is running.

    Chris Skorlinski from the Microsoft SQL Server Replication Support Team has two great articles about distribution DB growth that I recommend reading before considering the more invasive approach of creating multiple distribution DBs:

    How to resolve when Distribution Database is growing huge (+25gig)

    How Replication setting Immediate_sync may cause Transactional Replication Distribution database growth

    FWIW I asked Chris about the author's suggestion and he indicated that he's never had a case where he found it necessary to create multiple distribution DBs to alleviate performance issues.

    Kendal Van Dyke
    http://kendalvandyke.blogspot.com/

  • David.Poole (3/30/2010)


    On one particular publication we changed the retention period to 3 hours because if something goes wrong with the subscriber then the transactions start building up in the distributor and if that comes under stress the publisher log files start to bloat until disk space on the production box is threatened.

    David,

    I'm curious - for those publications is immediate_sync set to TRUE? Or is it set to FALSE and you're just writing off any subscriber that falls further than 3 hours behind?

    Kendal Van Dyke
    http://kendalvandyke.blogspot.com/

  • FWIW I asked Chris about the author's suggestion and he indicated that he's never had a case where he found it necessary to create multiple distribution DBs to alleviate performance issues.

    Great discussion!

    For what it's worth, I have seen scenarios where multiple Distribution databases provide a performance improvement.

    For example, let’s say we have two Publications that form part of an application platform. One Publication is sourced from a very intensive OLTP database and another Publication has relatively moderate activity. Given that both Publications share the same Distributor database, the OLTP Publication could be responsible for 80% of the overall activity to the Distributor, thereby adversely affecting/constraining the performance of any other Publications that share the same Distributor.

    By providing a resource intensive Publication its own dedicated Distributor database, you can isolate the performance from other Publications/Applications.

    Who knew Replication could be such fun 🙂

  • For example, let’s say we have two Publications that form part of an application platform. One Publication is sourced from a very intensive OLTP database and another Publication has relatively moderate activity. Given that both Publications share the same Distributor database, the OLTP Publication could be responsible for 80% of the overall activity to the Distributor, thereby adversely affecting/constraining the performance of any other Publications that share the same Distributor.

    Were the distribution databases on separate physical drives? I could see why that would help, but if they're on the same set of spindles I'm still not convinced that, provided your indexes and statistics are in good shape, having multiple databases offers a significant performance improvement. (I'll buy that it may offer a marginal improvement in some very specific circumstances.)

    Kendal Van Dyke
    http://kendalvandyke.blogspot.com/

  • Great article David.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw
    Learn Extended Events

