deleting 22 million records

  • Hi All,

    I am having a stored procedure which will delete 22 millions of record.

    Is there any quickest approach to deleting these records from the tables.

    below is the query is used for deleting.

    DELETE FROM tblHistory WHERE id = @id AND TransDate < GetDate() - 30

    Thanks in advance for any info.

    Regards,

    Mohanraj Jayaraman

  • The best thing I can think of...

    If the number of records you will have left is significantly less than the number you are deleting.

    Look into selecting what you want to keep into a temp table and then truncating the table you want to delete from.

    I have an ISA Log database I maintain by having a current table and and old table. Once a month I drop the old table, renamed the current table to old and recreate the current table. Since we are not required to keep more than 30 days old... I found this worked better than any kind of data moving.

    In order to make this work and not kill ISA... I put an instead of trigger onto the actual isa log table. Catch all the records and put them in current. When I moved that databases around I disabled the trigger, drop the old table, rename the current to old, create the new, and reenable the trigger.



    --Mark Tassin
    MCITP - SQL Server DBA
    Proud member of the Anti-RBAR alliance.
    For help with Performance click this link[/url]
    For tips on how to post your problems[/url]

  • If you're going for the straight delete - make sure you're doing it 'out of hours' as this is going to tablelock until the delete's complete - which could be some time. If you don't have enough space for the logs to expand to and the transaction rolls back as a result, it's going to stay locked throughout the rollback too (as would te rest of the db - but I'm assuming a table that big's probably critical anyway).

    To bite it off in smaller chunks which are unlikely to lock the table you could try something along the lines of

    DECLARE @datToDate DATETIME

    SET @datToDate = WHATEVERDATE

    WHILE (SELECT Count(*) FROM TABLENAME WHERE CRITERIA) <> 0

    BEGIN

    DELETE FROM TABLENAME

    WHERE ID IN

    (

    SELECT TOP 2999 ID

    FROM TABLENAME

    WHERE CRITERIA

    )

    END

    and kick it off as an overnight job.

    You could even take in lighter bites by breaking up your overnight clearing up into smaller chunks by hardcoding the date value and amending or assigning it using a select - say min date plus 1, 2 or 3 months

  • Mohanraj (9/29/2008)


    Hi All,

    I am having a stored procedure which will delete 22 millions of record.

    Is there any quickest approach to deleting these records from the tables.

    below is the query is used for deleting.

    DELETE FROM tblHistory WHERE id = @id AND TransDate < GetDate() - 30

    Thanks in advance for any info.

    Regards,

    Mohanraj Jayaraman

    If I were you, then I would create a temp table in database and move all records of last 30 days into it and then drop original table and then rename temp table to original table.

    I assume that you will definitely have much less records to insert than to delete.

    SQL DBA.

  • Could break it up into smaller chunks using top(x) with a loop around it until no more records are deleted. I've used this with some success archiving data, had about 200 million records, did a top of I think 1000 records (might have been more, did it to the point one chunk delete took about 20 seconds), then a short waitfor delay at the end to give any other process time to get in and do it's job.

  • Thanks to everyone for give me the best approach.

    However, I would like to follow the practice while will delete the records in the loop..

    by incrementing 10000 records at each loop execution.

    Thanks onceagain to all for given the reply.

  • 10, 000 records will escalate to a table lock

  • True. It can, it can also happen a lot sooner than that. Point is to break it up into chunks that would not impact business. Which is why when I did mine I experiented to get it down to minimal impact, and gave the waitfor delay to give any statement that is waiting for a resource time to execute, since even if that is a longer than the delay query, the delete query should not cause a table lock as long as another query has a lock on it.

    (locking mechanism is way beyond a simple post to put the entire rules in though.)

  • DELETE FROM tblHistory WHERE id = @id AND TransDate < GetDate() - 30

    Am I missing something? If ID is a primary / unique key it will delete 1 row ...

    Wilfred
    The best things in life are the simple things

  • SanjayAttray (9/29/2008)


    If I were you, then I would create a temp table in database and move all records of last 30 days into it and then drop original table and then rename temp table to original table.

    I assume that you will definitely have much less records to insert than to delete.

    Sanjay, my take exactly, although bear in mind if the table has keys (FGN or Pri) or indexes they will need re creating

    -----------------------------------------------------------------------------------------------------------

    "Ya can't make an omelette without breaking just a few eggs" 😉

  • Anders Pedersen (9/29/2008)


    True. It can, it can also happen a lot sooner than that. Point is to break it up into chunks that would not impact business. Which is why when I did mine I experiented to get it down to minimal impact, and gave the waitfor delay to give any statement that is waiting for a resource time to execute, since even if that is a longer than the delay query, the delete query should not cause a table lock as long as another query has a lock on it.

    (locking mechanism is way beyond a simple post to put the entire rules in though.)

    Decent guide to 2005 lock escalation on this blog

    http://blogs.msdn.com/sqlserverstorageengine/archive/2006/05/17/Lock-escalation.aspx

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply