Advice on purging data

  • A portion of our OLTP DB has 12 years of data, but we have clearance to purge all data over 2 years old. Through a long trial-and-error process we finally have a script that deletes data from 30 different tables and takes 12 hours in an SSD-backed test environment. We know that doing this live would have to be done at night, incrementally, a couple of hours at a clip over many weeks (or longer), which causes its own problems (no disabling keys or indexes, no ALTER DATABASE to SIMPLE, etc.), never mind reorganizing indexes and updating stats to keep the system fast. I've read about the idea of making a new table to copy the keep-data into, but again, doing that for 30 tables, all while online, does not seem like the best option.

    Which leads to my brainstorm (or fart depending on your opinion of the process):

    1. Take a full backup of the production OLTP DB. Change the backup plan to just differentials and log backups (disable the full backup temporarily)
    2. Restore the full backup to the SSD computer (20 minutes)
    3. Run the cleanup script on the SSD OLTP DB (~12 hours)
    4. Take the prod OLTP DB offline, back up the tail of the log (< 1 hour)
    5. Restore the chain of differential and log backups since the last full to the SSD OLTP DB (1-2 hours)
    6. Run re-indexing/update stats on the SSD OLTP DB, shrink the DB leaving ~20% free space (1-2 hours)
    7. Back up the SSD OLTP DB and restore it to live, bring it all online (< 1 hour)
    8. Happy hour

    Of course we'd test the bejesus out of this process flow, along with testing the cleansed database in a Dev/Test environment, but I hadn't read anything about this approach and wondered what you all thought... What am I missing or failing to consider?

    And, we do have preset 4(ish) hour maintenance windows, so management would be OK with the 4 (to 6) hours of downtime.
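
    A minimal sketch of what step 6 could look like in T-SQL - the table names, logical data file name, and shrink target below are placeholders, not our real schema:

      -- Rebuild indexes and refresh statistics on the cleansed copy (placeholder table names)
      ALTER INDEX ALL ON dbo.Orders REBUILD;
      ALTER INDEX ALL ON dbo.OrderDetails REBUILD;
      UPDATE STATISTICS dbo.Orders WITH FULLSCAN;
      UPDATE STATISTICS dbo.OrderDetails WITH FULLSCAN;

      -- Shrink the data file, leaving roughly 20% free space
      -- (logical file name and target size in MB are placeholders)
      DBCC SHRINKFILE (N'MyOLTP_Data', 50000);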

  • Since you're only keeping ~1/6 of the data, you may want to consider a different approach.

    Back up everything (of course!).  You might even want to make another copy of the data in the background, just in case.

    [The steps are not necessarily complete, but I've added numbers to allow easier commentary on specific step(s) w/o having to repeat the text.]

    1) Put the main db in "SIMPLE" mode (if it's not in SIMPLE already).
    2) Truncate the existing tables (drop FKs if necessary to allow truncates).
    3) [If compression is available to you, you may want to turn that on.  It will slow down the load, but speed up later processing.]
    4) Copy in the last 2 years of current data that you want to keep.
    5) Recreate FKs, if applicable.
    6) Put the main db back in "FULL" mode (if it was in FULL earlier).
    7) Take a complete backup of the new db.
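
    As a rough T-SQL sketch of steps 1) through 6) for a single pair of tables (all object names, the FK, and the restored source copy MyOLTP_Copy are placeholders - this assumes the keep-rows are pulled back in from a restored copy of the database):

      -- 1) SIMPLE recovery to minimize logging during the reload
      ALTER DATABASE MyOLTP SET RECOVERY SIMPLE;

      -- 2) Drop the FK that blocks the truncate, then truncate both tables
      ALTER TABLE dbo.OrderDetails DROP CONSTRAINT FK_OrderDetails_Orders;
      TRUNCATE TABLE dbo.OrderDetails;
      TRUNCATE TABLE dbo.Orders;

      -- 3) Optional: page compression on the emptied table(s)
      ALTER TABLE dbo.Orders REBUILD WITH (DATA_COMPRESSION = PAGE);

      -- 4) Reload only the last 2 years from a restored copy
      --    (add SET IDENTITY_INSERT ON/OFF if identity columns are involved)
      INSERT INTO dbo.Orders WITH (TABLOCK) (OrderID, CustomerID, OrderDate)
      SELECT OrderID, CustomerID, OrderDate
      FROM MyOLTP_Copy.dbo.Orders
      WHERE OrderDate >= DATEADD(YEAR, -2, GETDATE());

      -- 5) Recreate the FK
      ALTER TABLE dbo.OrderDetails ADD CONSTRAINT FK_OrderDetails_Orders
          FOREIGN KEY (OrderID) REFERENCES dbo.Orders (OrderID);

      -- 6) Back to FULL recovery (then take the full backup in step 7)
      ALTER DATABASE MyOLTP SET RECOVERY FULL;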

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • Hi Scott,

    I'd be concerned with staying within the maintenance window - steps 1 through 7 would need to happen within 4-6 hours. A step 0 of a backup before 1-7 would ensure I could restore the DB if it looked like I wouldn't make my window... I could see combining the approaches: doing your 1-7 on the SSD computer, then restoring it/the logs, etc. to live. So far, I like the idea of doing the data cleanse on another computer so I don't have to sweat how long it takes (within reason; I don't want to be on just differential/log backups too long).

    What pros/cons do you see in my approach? In yours?

    Thank you for responding!

  • aquanut - Monday, December 10, 2018 12:42 PM

    Doing it on the SSD computer is better; that's fine.  I was more concerned with deleting 5/6 of the data rather than just inserting the 1/6 of the data.  DELETEs have bigger overhead than INSERTs anyway, and 5 times the volume would of course be much more costly still.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • Depending on other factors and the volume of data, the following may also be an option.

    It mainly depends on whether you can easily identify the updates/deletes/inserts done on the target tables.

    Add a new filegroup to that DB.
    Create copies of the tables you need to purge on this FG and copy in the recent data you wish to keep.
    Create all required indexes on them.

    At go-live, put the DB in single-user mode.
    Identify any DML done on the original tables since the copy and apply it to the copies.
    Rename the old tables to _old, rename the new tables to the original names.

    Deal with any FKs as needed.
    Change the DB back to multi-user.

    Optionally, and depending on the volume of the remaining tables in this DB, rebuild them all on the new FG.
    This could be done in chunks in advance for the smaller tables to save time at go-live.
    If it is done at this point, the old file will be empty and can be shrunk without any delay.

    Then, if required, rebuild all tables/indexes on the primary FG and drop the temporary one - or make the new one the default FG if you don't mind keeping both.
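
    A rough T-SQL sketch of that flow for one table (the filegroup, file path, table/column names, and 2-year cutoff are all placeholders; the DML catch-up is left as a comment since it depends on how changes can be identified):

      -- Add a new filegroup and file (names and path are placeholders)
      ALTER DATABASE MyOLTP ADD FILEGROUP FG_Purge;
      ALTER DATABASE MyOLTP ADD FILE
          (NAME = N'MyOLTP_Purge1', FILENAME = N'E:\Data\MyOLTP_Purge1.ndf', SIZE = 50GB)
          TO FILEGROUP FG_Purge;

      -- Build the trimmed copy of a table on the new filegroup, with its indexes
      CREATE TABLE dbo.Orders_new
      (
          OrderID    int          NOT NULL,
          CustomerID int          NOT NULL,
          OrderDate  datetime2(0) NOT NULL,
          CONSTRAINT PK_Orders_new PRIMARY KEY CLUSTERED (OrderID)
      ) ON FG_Purge;

      INSERT INTO dbo.Orders_new WITH (TABLOCK)
      SELECT OrderID, CustomerID, OrderDate
      FROM dbo.Orders
      WHERE OrderDate >= DATEADD(YEAR, -2, GETDATE());

      -- At go-live: single-user, catch up any DML since the copy, then swap names
      ALTER DATABASE MyOLTP SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
      -- ...apply recent inserts/updates/deletes to dbo.Orders_new here...
      EXEC sp_rename 'dbo.Orders', 'Orders_old';
      EXEC sp_rename 'dbo.Orders_new', 'Orders';
      ALTER DATABASE MyOLTP SET MULTI_USER;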

  • aquanut - Monday, December 10, 2018 11:39 AM

    This process won't work - once you bring the database on the SSD server online (recover it) to run the cleanup script, you can no longer restore the differentials/t-logs from production against it.  If you can take the application offline and keep it unavailable during the purge, then doing the purge directly in production shouldn't be an issue either.

    However - if you must keep the system online and functional - then your only option is to delete in small batches across all 30 tables.  I would review the delete script: if it is performing the deletes as a single operation for each table, change that to perform the deletes in batches.  In most cases, deleting in smaller batches will actually take less time than a single operation.
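
    A minimal sketch of a batched delete for one table (the table name, date column, cutoff, and batch size are placeholders, and the real script still has to respect FK order across the 30 tables):

      -- Delete in batches to keep each transaction, and the blocking/log growth, small
      DECLARE @rows int = 1;

      WHILE @rows > 0
      BEGIN
          DELETE TOP (5000) FROM dbo.Orders
          WHERE OrderDate < DATEADD(YEAR, -2, GETDATE());

          SET @rows = @@ROWCOUNT;

          -- Optionally back up the log here (FULL recovery) and/or add a short WAITFOR DELAY
      END;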

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • Jeffrey Williams 3188 - Monday, December 10, 2018 3:58 PM

    You are 100% correct. I was hoping that by starting with the backup the chain would stay intact, but it cannot be merged. Which on one level makes sense; I was just hoping to get around it.

    Current thought process is:

    1. Put the production system into maintenance, single-user; take a backup; take it offline.
    2. Move the backup to the SSD environment, restore it, run the delete script*, then take a backup.
    3. Restore that backup to production and bring it online.

    *Scott's process is likely faster and can be done in place - but a whole other round of process testing and iterating through errors won't go over well with my bosses.

    Thank you again to all who responded.
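
    A hedged sketch of that revised flow in T-SQL (database name, logical file names, and paths are placeholders; the purge script itself is just a comment):

      -- 1) On production: stop activity, final full backup, take the DB offline
      ALTER DATABASE MyOLTP SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
      BACKUP DATABASE MyOLTP TO DISK = N'X:\Backups\MyOLTP_prepurge.bak' WITH INIT, CHECKSUM;
      ALTER DATABASE MyOLTP SET OFFLINE;

      -- 2) On the SSD server: restore, run the purge, back up the cleansed copy
      RESTORE DATABASE MyOLTP FROM DISK = N'X:\Backups\MyOLTP_prepurge.bak'
          WITH MOVE N'MyOLTP_Data' TO N'S:\Data\MyOLTP.mdf',
               MOVE N'MyOLTP_Log'  TO N'S:\Log\MyOLTP.ldf',
               RECOVERY;
      -- ...run the delete script, re-index, update stats...
      BACKUP DATABASE MyOLTP TO DISK = N'X:\Backups\MyOLTP_postpurge.bak' WITH INIT, CHECKSUM;

      -- 3) Back on production: replace the old DB with the cleansed one and reopen it
      RESTORE DATABASE MyOLTP FROM DISK = N'X:\Backups\MyOLTP_postpurge.bak'
          WITH REPLACE, RECOVERY;
      ALTER DATABASE MyOLTP SET MULTI_USER;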

  • I guess I would take a slightly different approach.

    Create views of the tables you are keeping, with the WHERE clause configured to return the rows you want to keep.
    Create a copy of each of the tables you are purging.
    BCP out from the views.
    BCP the data into the temp tables.
    Rename the existing tables.
    Rename the temp tables.

    An INSERT INTO the temp tables may work as opposed to BCP, but BCP is usually far faster and can be minimally logged.
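
    A small sketch of that view-plus-BCP flow for one table (the view, staging table, server name, and file path are hypothetical, and the bcp calls run from a command prompt):

      -- Filtered view over the rows to keep (table/column names and cutoff are placeholders)
      CREATE VIEW dbo.vOrders_keep
      AS
      SELECT OrderID, CustomerID, OrderDate
      FROM dbo.Orders
      WHERE OrderDate >= DATEADD(YEAR, -2, GETDATE());
      GO

      -- Empty staging copy of the table to load into
      SELECT TOP (0) * INTO dbo.Orders_stage FROM dbo.Orders;
      GO

      /* From a command prompt (server name and file path are placeholders):
         bcp MyOLTP.dbo.vOrders_keep out C:\purge\Orders.dat -n -S MyServer -T
         bcp MyOLTP.dbo.Orders_stage in C:\purge\Orders.dat -n -S MyServer -T -h "TABLOCK"
      */

      -- Swap the tables once the load is verified
      EXEC sp_rename 'dbo.Orders', 'Orders_old';
      EXEC sp_rename 'dbo.Orders_stage', 'Orders';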

    Michael L John
    If you assassinate a DBA, would you pull a trigger?
    To properly post on a forum:
    http://www.sqlservercentral.com/articles/61537/
