Reducing the size of the distribution database & improving transactional replication performance: Part 1

In this series of blog posts I will be looking at issues regarding the size and performance of the distribution database in a transactional replication environment.

Symptoms:

Your distribution database is growing quite large
The distribution cleanup job is taking a long time to run, yet not clearing as much data as you would expect
Disk reads are high throughout the day regardless of system throughput

When using transactional replication, all of the data waiting to be synchronised with the subscriber is held in the MSrepl_commands and MSrepl_transactions tables in the distribution database. Once transactions have been committed at the subscriber you probably want them to be cleaned up promptly by the distribution clean up job, but this may not be happening. Depending upon the number of transactions being replicated, this can cause the distribution database to grow significantly and impact system performance.

The distribution clean up job runs every ten minutes (on its default schedule) and each time it runs it looks for transactions that can be cleared. You may well find that the clean up job is running for long periods, and that it is hitting the disk hard whilst achieving very little in terms of actual clean up.

Take a look at the following scenario:

1. DML changes are made in the published database
2. Changes get written to the replication tables
3. Changes get replicated to the subscriber
4. Cleanup looks to find entries to remove from the replication tables
5. Cleanup only removes a small subset of entries, seemingly ignoring any recently replicated entries
6. More DML changes get written (back to step 1)

As you can see, on busy systems the replication tables can quickly fill up, and if you are not using the correct publication settings you may well be holding information you don’t need, to the detriment of replication and your system as a whole.

This series of articles discusses a number of techniques for reducing the size of the distribution database and improving transactional replication performance.

Part 1 – Only keep the data that hasn’t been synchronised

The first issue to check for is data being replicated correctly but not being cleared out of the distribution database afterwards. If this is the case, the cause is the immediate_sync publication setting. When it is enabled, all transactions are held for the full retention period of the distributor, rather than only the transactions that haven’t yet been synchronised. This means that each time the distribution clean up job runs it only deletes entries older than the retention period. The default transaction retention period is 72 hours.
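To get a feel for what the retention period means in terms of table size, here is a rough back-of-the-envelope sketch in Python. It assumes a steady rate of replicated commands; the rate of 50,000 commands per hour and the half-hour delivery lag are made-up figures for illustration only, and the estimate ignores everything except how long each command is held.

```python
def estimated_rows_held(commands_per_hour, hold_hours):
    """Approximate steady-state row count if every command is kept for hold_hours."""
    return int(commands_per_hour * hold_hours)


RETENTION_HOURS = 72         # default transaction retention period, as stated above
commands_per_hour = 50_000   # hypothetical workload, not a measured figure
delivery_lag_hours = 0.5     # hypothetical time before delivered commands are cleaned up

# With immediate_sync on, commands are held for the whole retention period.
with_immediate_sync = estimated_rows_held(commands_per_hour, RETENTION_HOURS)

# With it off, commands only need to be held until they have been delivered.
without_immediate_sync = estimated_rows_held(commands_per_hour, delivery_lag_hours)

print(with_immediate_sync)
print(without_immediate_sync)
```

Under these assumed figures the difference is between millions of rows and tens of thousands, which is why the clean up job appears to achieve so little on a busy system.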

[Screenshot: Distributor Properties]

The excellent article by Paul Ibison (http://www.replicationanswers.com/TransactionalOptimisation.asp) covers how to identify and resolve this issue by setting the immediate_sync and allow_anonymous values to false. However, that article only mentions altering a specific publication.

If you have multiple transactional publications for the same publisher you will need to ensure that you update each publication. Otherwise the problem will not be resolved.
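As a minimal sketch of what "update each publication" could look like, the Python below builds the sp_changepublication calls for a list of publications. The publication names are hypothetical placeholders. It assumes allow_anonymous has to be switched off before immediate_sync, and that the resulting script is run at the publisher in the publication database. Check the sp_changepublication documentation for your SQL Server version before running anything it produces.

```python
def quote_literal(value):
    """Return value as a T-SQL Unicode string literal, doubling embedded quotes."""
    return "N'" + value.replace("'", "''") + "'"


def disable_immediate_sync_script(publications):
    """Build a T-SQL script that sets allow_anonymous, then immediate_sync, to false."""
    lines = []
    for name in publications:
        # Order matters here: allow_anonymous first, then immediate_sync.
        for prop in ("allow_anonymous", "immediate_sync"):
            lines.append(
                "EXEC sp_changepublication "
                f"@publication = {quote_literal(name)}, "
                f"@property = N'{prop}', @value = N'false';"
            )
    return "\n".join(lines)


# Hypothetical publication names; substitute every transactional publication you have.
publications = ["SalesPub", "OrdersPub"]
script = disable_immediate_sync_script(publications)
print(script)
```

Generating the script rather than typing it by hand makes it harder to miss a publication, which matters because a single publication left unchanged is enough to keep the problem in place.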

The sp_MSdelete_publisherdb_trans procedure is called as part of the distribution clean up process, and this procedure checks whether there are any publications with immediate_sync set. If so, it will only delete commands and transactions older than the retention period, regardless of whether there are any publications that aren’t set to immediate_sync.
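The rule described above can be sketched as a small Python model. This is a simplified illustration of the behaviour, not the code of sp_MSdelete_publisherdb_trans; the dictionary keys ('id', 'entry_time', 'delivered', 'immediate_sync') are names invented for the example.

```python
from datetime import datetime, timedelta


def rows_eligible_for_cleanup(transactions, publications, now, retention_hours=72):
    """Return the transactions a clean up run could delete in this simplified model."""
    cutoff = now - timedelta(hours=retention_hours)
    if any(p["immediate_sync"] for p in publications):
        # One immediate_sync publication is enough: only expired entries can go.
        return [t for t in transactions if t["entry_time"] < cutoff]
    # Otherwise delivered entries can go as well as expired ones.
    return [t for t in transactions if t["delivered"] or t["entry_time"] < cutoff]


now = datetime(2024, 1, 4, 12, 0)
transactions = [
    {"id": 1, "entry_time": now - timedelta(hours=80), "delivered": True},
    {"id": 2, "entry_time": now - timedelta(hours=10), "delivered": True},
    {"id": 3, "entry_time": now - timedelta(hours=1), "delivered": False},
]

mixed = [{"immediate_sync": True}, {"immediate_sync": False}]
all_off = [{"immediate_sync": False}, {"immediate_sync": False}]

mixed_ids = [t["id"] for t in rows_eligible_for_cleanup(transactions, mixed, now)]
all_off_ids = [t["id"] for t in rows_eligible_for_cleanup(transactions, all_off, now)]
print(mixed_ids)
print(all_off_ids)
```

In the mixed case only the entry older than the retention period is eligible, even though a second entry has already been delivered. Once every publication has immediate_sync off, the delivered entry becomes eligible too.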

Once you have changed the immediate_sync and allow_anonymous settings for every transactional publication, you should execute the distribution clean up job. The first run may well take some time, but when it completes the job log should show a large number of deletes. From this point on the job should be far quicker to execute and will actually clean up the MSrepl_transactions and MSrepl_commands tables.

It is important to note that the distribution cleanup mechanism treats the publications as a collective rather than as individuals, so when it performs deletes it bases them on the collective settings rather than the individual ones. The way that it does this, however, is not obvious.

This will form the starting point for part 2 of the series.

The post Reducing the size of the distribution database & improving transactional replication performance: Part 1 appeared first on BI Design.
