Data Archiving in replicated database

  • Dear All,

    We have online database application running 24X7. There are 5 major tables and around 30 million rows (2 years data) on each table. The production database is being replicated to backup server using transactional replication.

    The application needs only 6 months data on the database. Currently, the DB has 2 years data. I am thinking to move 1.5 years data to another database. Since there are lots of non-clustered indexes, the delete runs too slow. When i put the database in our test lab, it takes 1 hr to delete 2.3 million rows in 1 table. (batch delete with 1sec wait delay). It may take around 20 hours to delete around 22 million rows in 1 table.

    My thoughts are

    1. Stop application during non-peak hour. Rename the original tables. create new tables, copy only 6 months data and then create necessary indexes. start application. Move the tables out of the database. But replication will break.

    2. Break replication. Run delete in small batches with wait for sec. So that application does not stop and my tran log wont explode.

    3. Stop application. Delete all the indexes (other than the index in where clause to delete) Replication ????

    Which is better way? or is there any other way to do this?

  • OR

    you can move unused data to different filegroup and put that filegroup in another drive. (Putting onto same drive where you already have operation mdf file, will affect performance).

    You can filter out the tables which need to be archived then create archive tables and move old data into that(like if table is employee then archive table is Arch_Employee).

    I think its easy to manage.

    Edit:-

    Also you can filter out your archived table not to be replicated.

    ----------
    Ashish

  • I have had a similar situation where my database was growing to nearly 1 TB and we were having some performance issues because a few of my tables were nearly 100 GB in size and because we were running on sql server 2000 it was quite difficult for us to perform the maintanence.

    So, what i used to do is similar to what you said.

    1.Transfer the required data to a new table with similar structure, rename the current table to some new archived table (Employee_Archived_2010) and then creating all the indexes as it is and then rename this by the original production table.

    Doing this proved to be a better idea to me and the downtime was also very little because we transfer comparatively very less data to the new table . Deleting the old data from the prodcution table is never been a good practice .

    But when you follow the above mentioned strategy pls make sure that all your depend objects are linked with the new table otherwise you might have to face serious issues specially if you have triggers associates with your tables\views.

  • Thanks Ashish & Sachnam. BTW Ashish, i like your quote 😀

    My problem is that, i need to break the replication. complete the tasks and delete the subscriber and create all new transactional replication.

    Without breaking replication, the delete will kill both publisher and subscriber with delete statements.

  • 🙂 thanks for confirmation that you too agree.

    -------------------------------------------------

    for the safe side, anyhow you have to take the downtime, no matter which strategy you choose.

    So will suggest, to draft your strategy first and then have a test of it(if you have testing environment) and then apply in production.

    Moving data from one table to other, dont think will take much time and redefining replication also should not take much time.

    In total I cant see more than 2-4 hrs downtime in this case if suggested approached.

    PS:- dont worry you improving the things and down time can easily be justified. Good Luck. Also dont forget to take a full backup before starting your activity. Is is just to make sure you can bring the database back to initial stage if anything goes wrong in between.

    ----------
    Ashish

  • Thanks Ashish.

    Yeah, i already i put the data in the lab. inserting into new tables (6 months data) is much easier than deleting 1.5 year data in the original table. And, replication does not take much time. Thanks guys.

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply