• Jeff Moden (5/6/2013)


    Eric M Russell (5/6/2013)


    mah_j (5/4/2013)


    Hi,

    I want to archive the data from a 400 GB table. What is the best way of archiving?

    The table has a clustered index on 2 fields (int, datetime). We only select from the last 2 years of data and insert into this table; the old data is selected only rarely, for example just 5 times a month, and never inserted into.

    I want to create a second table, insert the old data into it, and create indexes on both tables. Is that a pointless act?

    How can I archive the old data?

    Sometimes physically partitioning the rows into separate tables or partitions makes sense. However, indexing is also a form of partitioning. This problem could be solved by effective clustering of the rows in the table and an index on the transaction date.

    Selecting a range of rows from a table filtered by the clustered key is generally very fast, even with 100 million+ rows, unless the table is heavily fragmented. Check the level of fragmentation on the table. From what you've described, the table should be clustered on a sequential ID or insert date/time for optimal querying and minimal fragmentation.
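    One way to check fragmentation (a sketch using the physical stats DMV; Sales_History is the illustrative table name from the example below):

    -- Average fragmentation and size of each index on the table
    SELECT i.name AS index_name,
           ps.avg_fragmentation_in_percent,
           ps.page_count
    FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.Sales_History'), NULL, NULL, 'LIMITED') AS ps
    JOIN sys.indexes AS i
        ON i.object_id = ps.object_id AND i.index_id = ps.index_id;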

    Also, experiment with a filtered index on whatever column is used for transaction date. You can even periodically drop / recreate this index using a different cutoff date.

    For example:

    CREATE INDEX ix_current ON Sales_History (Date_Of_Sale, Product_ID) WHERE Date_Of_Sale >= '20120101';
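    When the cutoff date needs to move forward, the same index can be rebuilt in place with the new predicate (a sketch, same illustrative names):

    -- Recreate the filtered index with a newer cutoff in one statement
    CREATE INDEX ix_current ON Sales_History (Date_Of_Sale, Product_ID)
        WHERE Date_Of_Sale >= '20130101'
        WITH (DROP_EXISTING = ON);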

    To wit, the current clustered index on (int, datetime) should be reversed to (datetime, int), and it should be UNIQUE as well, to get rid of the uniquifier (4 bytes plus variable-length row overhead, up to 8 bytes per row) that will be added without it.
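    A sketch of that change, assuming the (datetime, int) combination really is unique, and using illustrative names (if the existing index backs a PRIMARY KEY constraint, the constraint has to be dropped and recreated instead):

    -- Rebuild the clustered index as UNIQUE with the datetime column leading
    CREATE UNIQUE CLUSTERED INDEX cx_Sales_History
        ON Sales_History (Date_Of_Sale, Sale_ID)
        WITH (DROP_EXISTING = ON);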

    And yet another addition... If you decide to move all of the data to its own database and make it read only, don't move all of the data at once. It will make your log very, very big very, very fast 🙂 Do it in batches that make sense for the size of your log. I always start with batches of 10,000 for these things and do it in a loop (yes, one of the times a loop is helpful). We actually have our archives set up as 100 GB databases on one server, where we simply create a new database at night, make it the "active" database, and then make the previous one read_only. I can explain more if you are interested in this, but I don't believe it is a good fit for most scenarios.
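    A minimal sketch of that batched move, assuming the archive table already exists with the same columns and sits on the same instance (all object names are illustrative):

    -- Move rows older than the cutoff in 10,000-row batches to keep each transaction small
    DECLARE @cutoff datetime = '20110101';
    DECLARE @rows int = 1;

    WHILE @rows > 0
    BEGIN
        DELETE TOP (10000)
          FROM dbo.Sales_History
        OUTPUT deleted.* INTO ArchiveDB.dbo.Sales_History
         WHERE Date_Of_Sale < @cutoff;

        SET @rows = @@ROWCOUNT;   -- a short batch means the last of the old rows are gone
    END;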

    Jared
    CE - Microsoft