Data archive techniques

  • Please share your experiences on archiving older data from tables that have grown very large.

    We've looked at several approaches including partitioning, but management seems to want the archived data in a new database on a new instance of sql on the same hardware. So data would be moved to archive, then deleted from the primary database tables. The application code would have to be modified to accomodate this. The hardware/servers will be upgraded as needed to handle the new instance.

    We ( DBAs ) had proposed either table partitioning within the same database, or to a second database on the same instance.

  • we have done on certain tables by moving the old data say year or more to another db and changed only reports to take data from that location.

    monthly job will push the dates year old to the archive DB.

    in my case it is fine analyse before implement anthing on production.

    Regards
    Durai Nagarajan

  • I've been looking at all the various techniques for partitioning on sql 2005 and later, but my specific question was really whether an application could query against both the primary sql server instance and a separate instance that might contain the archive database and tables. At this point I'm not sure what the advantage of putting the archived data in a separate instance would be, as opposed to just in a separate database on the same instance --- as long as the hardware is beefed up to support everything.

  • You could use linked servers and synonyms to access the data from another database in another instance on the same server (doesn't that sound convoluted). I personally don't see an advantage to this based on what you have posted. I would agree to moving the data to another database on the same instance, but why not just keep the archive tables in the current database?

    You can use partitioning to simplify the moving of data from the primary tables using the sliding window type processing.

    Then moving the data to another table or database can be done without impact to users accessing the primary tables.

  • Thanks Lynn. I'm not sure what will eventually shake out --- management is meeting on this today. DBA management seems to want a simple solution where data is moved out of primary tables more or less manually into archive tables ( subsequently deleting from primary table ) and have the application code changed to accomodate. Since there is virtually no business logic in the database to date ( all in C# .net code using ORM ), DBA management may fear creating partitioning, views etc that might break if schema is changed. We are currently inserting professional DBA influence that has never been here before -- all developer-driven up till now.

  • A partitioned table is a table, in fact starting in SQL Server 2005 all tables are actually partitioned with a single partition. It should have no affect on the application itself, just make it easier to manage the data, particularly move old data out if the table is partitioned correctly.

  • can be done for a date range or so by having multiple partitions

    As lynn specified if it is done correctly everything will be good.

    Regards
    Durai Nagarajan

  • The only reason to not use partitioning for access speed is if your entire schema is ORM controlled (I've seen this, it's not pretty) or if you're concerned about restore speed, and thus need that data in a different 'shell'. Even then, with proper use of filegroups and the like, it's not as big a concern but you do need to wrap your head around filegroup restores.

    However, the end result comes down to how often is this 'old data' going to be accessed? If it's common and will be 'mixed in' with current data, you've got a real weapon to use for your organizational preference, as partitioned builds will behave better.

    If it's once a year, at most, and is just there for historical legal happiness, meh, screw it, no biggie. Ship it out and keep 'em happy. If you're trying to bring sanity to the environment you're going to have a lot of 'hills' to fight on. Is this one really worth spending that political coin on?


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • The application does use an ORM. What is being proposed by our Senior DBA is partitioning along with with a view using union all to cover the primary table and archived version of the table. The view will have the same name as the original base table and triggers will cause inserts/deletes to be redirected away from the view to the base table ( new name ) or archive table.

  • Indianrock (8/22/2012)


    The application does use an ORM. What is being proposed by our Senior DBA is partitioning along with with a view using union all to cover the primary table and archived version of the table. The view will have the same name as the original base table and triggers will cause inserts/deletes to be redirected away from the view to the base table ( new name ) or archive table.

    +1. I like this plan. This is an effective use of the structure. 🙂

    EDIT: You just need to be careful to 'unhook' the ORM from being allowed to make schema-level changes to the underlying object. Which ORM are you guys running, nHibernate?


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • I've seen references to nHibernate but not really sure which ORM. I've never heard of the application/ORM making schema changes. Now this Agile Development shop does do frequent code releases and some of those touch schema, but not the day to day queries coming from clients via the application. If you're hinting at what happens if development schema changes break the views/triggers -- I had the same question. We're trying to turn the corner from "development drives everything" to DBA involvement, but the car is now practically up on two wheels making that turn. 🙂

  • Indianrock (8/22/2012)


    The application does use an ORM. What is being proposed by our Senior DBA is partitioning along with with a view using union all to cover the primary table and archived version of the table. The view will have the same name as the original base table and triggers will cause inserts/deletes to be redirected away from the view to the base table ( new name ) or archive table.

    why not union only when the old data is required say using a SP and union it only when the date range goes beyond certain limit.

    Regards
    Durai Nagarajan

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply