Degraded Replication Performance

  • Hi,

    I'm running a SQL 2005 publication that is partitioned and merge replicated to 354 separate subscribers. Each partition is a unique subset of the master data.

    Recently I have started having very slow replication performance. I am seeing a lot of failures, with errors such as:

    "The merge process was unable to create a new generation at the 'Publisher'. "

    "The merge process was unable to change generation history at the 'Publisher'."

    "The merge process was unable to access row metadata at the 'Subscriber'.

    When I look at the longest running transactions on the server, I'm seing huge lock numbers for some transactions. At the minute I have a single subscriber holding 1.3m locks across a range of tables (OBJECT/PAGE with Intent Exclusive, KEY with Exclusive). It is holding an open transaction that has been running for 6 hours. Usual replication times are from 10 minutes to 2 hours. A profiler trace on the deadlock graph and lock acquired/cancel/released/deadlock/deadlock chain events shows nothing untoward, just the usual chatter of acquired/released.

    Checking the subscriber via Replication Monitor shows that it is hanging on "Upload 99% complete". This is a common feature for the long running replications. They seem to get almost there, then hang for ages, then error with one of the messages above.

    From what I've read around the web, I'm thinking this might be a blocking issue? Specifically contention on the replication system tables. Perhaps related to a mismatch between msmerge_genhistory and msmerge_contents? See threads below as examples:

    http://social.msdn.microsoft.com/Forums/en-US/sqlreplication/thread/7da7efa0-52d4-48e6-8a7f-e3579f48c5ac/

    http://social.msdn.microsoft.com/forums/en-US/sqlreplication/thread/ea00420d-643b-4aef-aea5-345e6a9c4e4b/

    All merge agents are on a slow link profile

    Agent query timeout is at 65534

    generation_levelling_threshold is set to 0 at both subscribers and publisher

    Publisher config from sysmergepublications:

    publisherXXX

    publisher_dbXXX

    nameXXX

    descriptionXXX

    retention18

    publication_type1

    pubid8E72F963-2C85-4005-BC5F-8494A1BE23BE

    designmasterid8E72F963-2C85-4005-BC5F-8494A1BE23BE

    parentid8E72F963-2C85-4005-BC5F-8494A1BE23BE

    sync_mode0

    allow_push1

    allow_pull1

    allow_anonymous0

    centralized_conflicts1

    status1

    snapshot_ready1

    enabled_for_internet0

    dynamic_filters1

    snapshot_in_defaultfolder0

    alt_snapshot_folderXXX

    pre_snapshot_scriptXXX

    post_snapshot_scriptXXX

    compress_snapshot0

    ftp_addressNULL

    ftp_portXXX

    ftp_subdirectoryNULL

    ftp_loginANONYMOUS

    ftp_passwordNULL

    conflict_retention18

    keep_before_values0

    allow_subscription_copy0

    allow_synctoalternate0

    validate_subscriber_infoHOST_NAME()

    ad_guidnameNULL

    backward_comp_level90

    max_concurrent_merge0

    max_concurrent_dynamic_snapshots0

    use_partition_groups1

    dynamic_filters_function_listHOST_NAME()

    partition_id_eval_procMSmerge_evalpartid_sp_8E72F9632C854005

    publication_number1

    replicate_ddl1

    allow_subscriber_initiated_snapshot1

    distributorXXX

    snapshot_jobid0x31B5B507C7084542BC1BF618D874FDAF

    allow_web_synchronization0

    web_synchronization_urlNULL

    allow_partition_realignment1

    retention_period_unit0

    decentralized_conflicts1

    generation_leveling_threshold0

    automatic_reinitialization_policy0

    Subscriber @@version: Microsoft SQL Server 2005 - 9.00.4053.00 (Intel X86) May 26 2009 14:24:20 Copyright (c) 1988-2005 Microsoft Corporation Express Edition with Advanced Services on Windows NT 5.1 (Build 2600: Service Pack 2)

    Publisher @@version: Microsoft SQL Server 2005 - 9.00.4262.00 (X64) Aug 13 2009 17:06:39 Copyright (c) 1988-2005 Microsoft Corporation Standard Edition (64-bit) on Windows NT 5.2 (Build 3790: Service Pack 2)

    I'm really stuck here, any advice on a strategy for investigating these problems would be hugely appreciated. You may guess that I'm not (usually) a DBA.

    Thanks, Iain

  • Not sure if this will help but found these on SP3 fixes today - not sure on versions of your subscriber.

    http://support.microsoft.com/kb/955706

    959023 (http://support.microsoft.com/kb/959023/ ) FIX: A blocking issue occurs when you run the Merge Agent in SQL Server 2005

    959024 (http://support.microsoft.com/kb/959024/ ) FIX: When the Merge Agent synchronizes multiple batches of changes, the synchronization may take a long time to finish or the synchronization may expire unexpectedly in SQL Server 2005

    Version info:

    SQL Server 2005 Service Pack 4 9.00.5000.00

    SQL Server 2005 Service Pack 3 9.00.4035

    SQL Server 2005 Service Pack 2 9.00.3042

    SQL Server 2005 Service Pack 1 9.00.2047

    SQL Server 2005 RTM 9.00.1399

  • That looks very interesting Jamie, thanks.

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply