Deleting a large chunk of data on a replicated DB

  • Hi,

    I have four SQL Server 2008 instances with merge replication. Replication runs every 10 minutes.

    Now my client has asked me to delete a large amount of data (almost 2 years' worth). What's the best practice? Should I delete the data on the published DB and wait for replication to do its job (which could take a while...), or stop replication, run the delete script on the 4 servers simultaneously, and set replication back up once it's finished?

    Any advice would be really appreciated.

    Thanks a lot for your time and help.

  • What is the size of the database?

    What is the speed of the connection between servers?

    What is the record count to be deleted?

    Those answers will help determine a better approach.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw
    Learn Extended Events

  • SQLRNNR (7/16/2014)


    What is the size of the database?

    The database is about 3 GB.

    What is the speed of the connection between servers?

    One of the servers is far away and the connection is not that fast (Fair, according to Replication Monitor).

    What is the record count to be deleted?

    I don't know exactly, but it's in the millions, spread across several tables.

    Those answers will help determine a better approach.

    Thanks for your help.

  • What percentage of the data does the part to be deleted represent? For example, if the database has 100 years of history, then 2 years is a relative pittance. If the db has three years of data total, then 2 years is huge.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • ScottPletcher (7/16/2014)


    What percentage of the data does the part to be deleted represent? For example, if the database has 100 years of history, then 2 years is a relative pittance. If the db has three years of data total, then 2 years is huge.

    Right now, it represents about 50% of the database.

    So a total database size of 3 GB is fairly small.

    With the affected rows being roughly half of the database, that's about 1.5 GB, which is still pretty small.

    Taking into account the slower link to one of the replication partners, and not wanting to overwhelm that pipe, I would batch-process it from the server deemed to be the central/master server.

    I would create a process that deletes 50k-100k records at a time. Put that in a SQL Agent job and have it run once every 15 minutes (in this case, due to the slower WAN link). This should finish relatively quickly; the delay between runs just gives the system time to process everything, clear buffers, and breathe.

    You can tinker with the timing of the runs, since the 15 minutes is just a starting point.
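    As a rough sketch, each run of the Agent job could execute a step like the one below. The table name (dbo.SalesHistory), date column (SaleDate), and 2-year cutoff are placeholders for illustration; repeat a similar statement for each affected table and adjust the batch size to your environment.

    ```sql
    -- One SQL Agent job step: delete up to 50k old rows per run.
    -- The 15-minute schedule between runs paces the replication traffic.
    DELETE TOP (50000)
    FROM dbo.SalesHistory
    WHERE SaleDate < DATEADD(YEAR, -2, GETDATE());
    ```

    Because merge replication picks up each batch as ordinary changes, smaller batches keep the change volume per sync cycle manageable over the slow WAN link.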


    You could go the redo route if you can afford some brief downtime: stop replication, delete the data and rebuild indexes, then re-initialize the replication with a fresh snapshot.
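    That redo route could be sketched roughly as follows. This is a hedged outline only: the publication name (MyPublication), table name (dbo.SalesHistory), and date cutoff are placeholders, and it assumes push subscriptions administered from the publisher with the merge agents already stopped.

    ```sql
    -- 1. With the merge agents stopped, delete the old data on the publisher.
    --    (dbo.SalesHistory and the 2-year cutoff are placeholders.)
    DELETE FROM dbo.SalesHistory
    WHERE SaleDate < DATEADD(YEAR, -2, GETDATE());

    -- 2. Rebuild indexes on the affected tables to reclaim space.
    ALTER INDEX ALL ON dbo.SalesHistory REBUILD;

    -- 3. Mark every subscription for re-initialization.
    --    @upload_first = 'false' discards pending subscriber changes.
    EXEC sp_reinitmergesubscription
        @publication  = N'MyPublication',
        @upload_first = N'false';

    -- 4. Generate a fresh snapshot for the subscribers to apply.
    EXEC sp_startpublication_snapshot @publication = N'MyPublication';
    ```

    Note that @upload_first = 'false' throws away any changes pending at the subscribers, so confirm they have all synced before stopping replication.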


    Thanks a lot for the suggestions, I'll go that way!
