Data Archival

  • There is a requirement, details as below:

    1) Copy data from a transaction table (around a million records inserted daily) into an archival table

    2) Delete data from transaction table

    3) Step (1) - transactions can be minimally logged

    4) Step (2) - all transactions must be completely logged.

    5) There are 14 transaction tables; steps (1) to (4) need to be repeated for all of them.

    Here is the sample code (for one table)

    -- assuming archival table already exists

    -- Step 1: Copy data into archivalTbl from transactionTbl with minimal logging
    INSERT INTO archivalTbl WITH (TABLOCK)
        (col1, col2, col3)
    SELECT col1, col2, col3
    FROM transactionTbl;

    -- Step 2: Delete data from transactionTbl with full logging
    DELETE FROM transactionTbl;

    Can someone please advise on the best way to achieve this?

    Thank you.

    ~ Lokesh Vij


    Guidelines for quicker answers on T-SQL questions
    Guidelines for answers on Performance questions

    Link to my Blog Post --> www.SQLPathy.com

    Follow me @Twitter

  • Lokesh Vij (9/19/2012)


    There is a requirement, details as below:

    1) Copy data from a transaction table (around a million records inserted daily) into an archival table

    2) Delete data from transaction table

    3) Step (1) - transactions can be minimally logged

    4) Step (2) - all transactions must be completely logged.

    5) There are 14 transaction tables; steps (1) to (4) need to be repeated for all of them.

    Here is the sample code (for one table)

    -- assuming archival table already exists

    -- Step 1: Copy data into archivalTbl from transactionTbl with minimal logging
    INSERT INTO archivalTbl WITH (TABLOCK)
        (col1, col2, col3)
    SELECT col1, col2, col3
    FROM transactionTbl;

    -- Step 2: Delete data from transactionTbl with full logging
    DELETE FROM transactionTbl;

    Can someone please advise on the best way to achieve this?

    Thank you.

    First, INSERT INTO will be fully logged, not minimally logged (iirc).

    So, let me get this straight: each table that needs records archived adds approximately 1,000,000 records daily. Are you archiving that many records daily as well? What we really need to know is the structure of the data (DDL for the tables) so that we can make a determination of needs. If you are adding and deleting nearly 2,000,000 rows of data, then partitioning may be what is needed for the archiving process. Depending on the data, it may be possible to establish a sliding window scenario for quickly archiving data. This could also be done for adding data, depending on how new records are added; if it is by a bulk process, a similar process can be used to quickly add the data as well.
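
    For readers unfamiliar with the sliding window pattern, here is a minimal sketch of the setup. It assumes the table has a date column to partition on; all names (pf_TranDate, ps_TranDate, TranDate) and boundary values are hypothetical:

    -- Hypothetical sliding-window setup: one partition per day, so that a whole
    -- day's rows can later be switched out as a metadata-only operation.
    CREATE PARTITION FUNCTION pf_TranDate (date)
    AS RANGE RIGHT FOR VALUES ('2012-09-18', '2012-09-19', '2012-09-20');

    CREATE PARTITION SCHEME ps_TranDate
    AS PARTITION pf_TranDate ALL TO ([PRIMARY]);

    -- The transaction table is created on the scheme, keyed by the date column.
    CREATE TABLE dbo.transactionTbl
    (
        TranDate date NOT NULL,
        col1     int  NULL,
        col2     int  NULL,
        col3     int  NULL
    ) ON ps_TranDate (TranDate);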

  • Lynn Pettis (9/19/2012)


    First, INSERT INTO will be fully logged, not minimally logged (iirc).

    Hi Lynn

    I am using the table locking hint TABLOCK; this supports minimal logging.

    ~ Lokesh Vij


    Guidelines for quicker answers on T-SQL questions
    Guidelines for answers on Performance questions

    Link to my Blog Post --> www.SQLPathy.com

    Follow me @Twitter

  • Lynn Pettis (9/19/2012)


    If you are adding and deleting nearly 2,000,000 rows of data, then partitioning may be what is needed for the archiving process. Depending on the data, it may be possible to establish a sliding window scenario for quickly archiving data.

    +1

    With partitioning you will be able to truncate the data rather than deleting it from the table, which also avoids generating the large amount of transaction log that the current DELETE operation produces. (A sketch of the switch-and-truncate step follows this post.)


    Sujeet Singh
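
    To make the truncate-instead-of-delete idea concrete, here is a sketch of the switch-and-truncate step. It assumes the partitioned setup sketched earlier and an empty staging table (dbo.transactionTbl_stage, a hypothetical name) with the same structure on the same filegroup:

    -- Switch the oldest partition out of the transaction table; this is a
    -- metadata-only operation, so it completes almost instantly.
    ALTER TABLE dbo.transactionTbl
        SWITCH PARTITION 1 TO dbo.transactionTbl_stage;

    -- Archive the switched-out rows, then truncate the staging table.
    -- Truncation deallocates pages rather than logging each deleted row.
    INSERT INTO dbo.archivalTbl WITH (TABLOCK) (TranDate, col1, col2, col3)
    SELECT TranDate, col1, col2, col3
    FROM dbo.transactionTbl_stage;

    TRUNCATE TABLE dbo.transactionTbl_stage;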

  • Lokesh Vij (9/20/2012)


    Lynn Pettis (9/19/2012)


    First, INSERT INTO will be fully logged, not minimally logged (iirc).

    Hi Lynn

    I am using the table locking hint TABLOCK; this supports minimal logging.

    Care to show us a reference that supports this assertion? I haven't been able to find one, but I'll keep looking.

  • Lynn Pettis (9/20/2012)


    Lokesh Vij (9/20/2012)


    Lynn Pettis (9/19/2012)


    First, INSERT INTO will be fully logged, not minimally logged (iirc).

    Hi Lynn

    I am using the table locking hint TABLOCK; this supports minimal logging.

    Care to show us a reference that supports this assertion? I haven't been able to find one, but I'll keep looking.

    Looks like I found one in Books Online:

    When importing data into a heap by using the INSERT INTO <target_table> SELECT <columns> FROM <source_table> statement, you can enable optimized logging and locking for the statement by specifying the TABLOCK hint for the target table. In addition, the recovery model of the database must be set to simple or bulk-logged. For more information, see INSERT (Transact-SQL).

    So this, plus your comments, tells me that your table is a heap (no clustered index) and you are using simple or bulk-logged recovery model on your database.
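
    For anyone wanting to verify those two preconditions on their own database, a couple of quick catalog queries should do it (dbo.archivalTbl is the target name assumed from the original post):

    -- The recovery model must be SIMPLE or BULK_LOGGED for minimal logging.
    SELECT name, recovery_model_desc
    FROM sys.databases
    WHERE name = DB_NAME();

    -- The target must be a heap: expect a row with index_id = 0
    -- (type_desc = 'HEAP') and no clustered index (index_id = 1).
    SELECT OBJECT_NAME(object_id) AS table_name, index_id, type_desc
    FROM sys.indexes
    WHERE object_id = OBJECT_ID('dbo.archivalTbl');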

  • Hi Lynn

    You very rightly pointed out that my table is actually a heap, without a clustered index (I should have been more specific earlier). Just to add on: the transaction table does not have a date field, so partitioning (implementing a sliding window partition) is not feasible.

    Kindly suggest a better method.

    One more thing: I am not very sure whether SQL Server allows a bulk insert (using the BULK INSERT keyword) from one table to another. This feature is definitely available when importing data from a flat file (and other variants) into a table. (A file-based sketch follows this post.)

    I really appreciate your help on this.

    Thank you.

    ~ Lokesh Vij


    Guidelines for quicker answers on T-SQL questions
    Guidelines for answers on Performance questions

    Link to my Blog Post --> www.SQLPathy.com

    Follow me @Twitter
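
    On the BULK INSERT question above: BULK INSERT reads only from a data file, not from another table, so INSERT ... SELECT WITH (TABLOCK) remains the table-to-table route. For contrast, a sketch of the file-based form (the path and terminators are made-up examples):

    -- BULK INSERT imports from a file on disk; it cannot name a table as its source.
    BULK INSERT dbo.archivalTbl
    FROM 'C:\export\transactionTbl.csv'
    WITH (TABLOCK, FIELDTERMINATOR = ',', ROWTERMINATOR = '\n');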

    I'd still like to see the DDL for the table and some sample data.

  • Lokesh Vij (9/19/2012)


    There is a requirement, details as below:

    1) Copy data from a transaction table (around a million records inserted daily) into an archival table

    2) Delete data from transaction table

    3) Step (1) - transactions can be minimally logged

    4) Step (2) - all transactions must be completely logged.

    5) There are 14 transaction tables; steps (1) to (4) need to be repeated for all of them.

    What is the frequency of this process? Do you need to do it every day, or is it done on a monthly basis?

    If it is a monthly activity, or done once every 3-4 months, I guess I would try the approach below:

    1. Generate the DDL script for the transaction tables and store it.

    2. At the time of archiving, I would just rename all the transaction tables to some name like TransactionTableName_MM_DD_YY.

    3. Run the script generated in the first step to make new transaction tables which will now hold the new data.

    4. Now insert all the data that has to be archived into the archive table from the transaction table that we just renamed to TransactionTableName_MM_DD_YY.

    5. Once archiving is done, you can just truncate the old transaction table, i.e. TransactionTableName_MM_DD_YY.

    Benefits:

    1. When inserting the data into the archive table we are not touching the newly created transaction tables; we will be using the tables that we renamed to TransactionTableName_MM_DD_YY. This should make the insertion process perform much better.

    2. Once archiving is done, instead of deleting the data we can just truncate the table that we renamed to TransactionTableName_MM_DD_YY. This makes the step very fast (truncate will take seconds where delete could take hours!).

    P.S.: This is just an idea in my mind right now; you need to evaluate whether it can be performed in your environment. A sketch of the sequence follows below.


    Sujeet Singh
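
    A minimal sketch of the rename-and-swap sequence above, for one table. All object names are examples, and the empty replacement table would be created from the DDL scripted in step 1:

    -- Step 2: move the current table out of the way (metadata-only, instant).
    EXEC sp_rename 'dbo.transactionTbl', 'transactionTbl_09_20_12';

    -- Step 3: re-create an empty dbo.transactionTbl here from the scripted DDL,
    -- so new inserts continue against a fresh table while archiving runs.

    -- Step 4: archive from the renamed copy; with a heap target, TABLOCK, and
    -- simple/bulk-logged recovery, this insert can be minimally logged.
    INSERT INTO dbo.archivalTbl WITH (TABLOCK) (col1, col2, col3)
    SELECT col1, col2, col3
    FROM dbo.transactionTbl_09_20_12;

    -- Step 5: truncate (and optionally drop) the renamed table.
    TRUNCATE TABLE dbo.transactionTbl_09_20_12;
    DROP TABLE dbo.transactionTbl_09_20_12;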
