Delete a large volume of records by dividing the work into small chunks

  • Hi,

    I want to implement a purging process in our production environment. For that I am dumping the transactions into a "Purge_work" table and then deleting the records by referencing them.

    Below is the code for the same:

    --Below is the code which I am using to implement the purging process

    --Declare the variables
    Declare @TrxId char(11),
            @Count int

    --Get the TransactionID values into a cursor from Purge_work
    Declare delCursor Cursor For
    Select TrxId From Purge_work Order By TrxId Desc

    --Open the cursor and fetch the records one at a time
    Open delCursor

    Fetch Next From delCursor Into @TrxId

    Set @Count = 0

    While @@FETCH_STATUS = 0
    Begin
        --The "sp_DeleteTransaction" proc deletes the records from the corresponding tables
        Exec [sp_DeleteTransaction]
              @cTrxID = @TrxId
            , @szUserName = 'Pur_id'
            , @iLogLevel = 3    -- these are our system log variables, don't consider this param
            , @iSystemEnum = 34 -- these are our system log variables, don't consider this param

        Fetch Next From delCursor Into @TrxId
    End

    --Release the cursor when finished
    Close delCursor
    Deallocate delCursor

    My question is this: the "Purge_work" table has millions of records. I need to implement a process that divides all those records into small chunks and deletes the chunks one by one, so that my transaction log file does not suddenly grow and take up a lot of space.

    Can anyone help me develop the script?

  • kiran08.bi (9/26/2016)


    Hi,

    I want to implement a purging process in our production environment. For that I am dumping the transactions into a "Purge_work" table and then deleting the records by referencing them.

    Below is the code for the same

    My question is this: the "Purge_work" table has millions of records. I need to implement a process that divides all those records into small chunks and deletes the chunks one by one, so that my transaction log file does not suddenly grow and take up a lot of space.

    Can anyone help me develop the script?

    Doing 50 deletes of 100 records each would take up the same log space, or maybe more, as 1 delete of 5,000 records.

    The only way to stop the log from ballooning is to stagger the batches over the day and take log backups in between, if you are using the Full recovery model.

    I use batching to prevent table locks on a live system, making sure the batch size stays below the lock escalation threshold (roughly 5,000 locks by default).
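    If you want to confirm how escalation is configured on the table you are purging, sys.tables exposes the setting (SQL Server 2008 onwards); this is just a quick read-only check, with the table name as a placeholder:

    -- Read-only check of the table's lock escalation setting
    SELECT t.name, t.lock_escalation_desc
    FROM sys.tables AS t
    WHERE t.name = N'MyTable';   -- substitute your table name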

    Here is a way I do cleanups. I am sure you can customise it to fit your purpose.

    CREATE PROCEDURE [Operations].[CleanupMyTable]
    (
         @BatchSize SMALLINT
        ,@DaysToRetain SMALLINT = 31
    )
    AS
    DECLARE @Now DATETIME2(0) = SYSUTCDATETIME();
    DECLARE @DeleteDateUntil DATETIME2(0) = DATEADD(d, (@DaysToRetain * -1), @Now);

    IF (ISNULL(@DaysToRetain, 0) < 31) -- do not allow less than 31 days
    BEGIN
        RAISERROR ('ERROR - DaysToRetain value passed is too low', 16, 1);
        RETURN;
    END;

    CREATE TABLE #MyTable
    (
         MyTableId INT NOT NULL
        ,Deleted BIT DEFAULT 0
        ,CurrentBatch BIT DEFAULT 0
        ,PRIMARY KEY CLUSTERED (MyTableId)
    );

    -- Collect the keys of every row that is old enough to be removed
    INSERT INTO #MyTable (MyTableId)
    SELECT
        MT.MyTableId
    FROM
        MyTable AS MT WITH (NOLOCK)
    WHERE MT.LastModified < @DeleteDateUntil;

    -- Mark the first batch
    UPDATE TOP (@BatchSize) #MyTable
    SET
        CurrentBatch = 1;

    WHILE EXISTS (SELECT 1 FROM #MyTable WHERE Deleted = 0)
    BEGIN
        WAITFOR DELAY '00:00:00.5'; -- prevent continuous blocking; should be compulsory in any batch delete process

        DELETE MT
        FROM
            MyTable MT
            INNER JOIN #MyTable AS MTd ON MTd.MyTableId = MT.MyTableId
                AND MTd.CurrentBatch = 1;

        -- Flag the current batch as done
        UPDATE #MyTable
        SET
            CurrentBatch = 0
            ,Deleted = 1
        WHERE
            CurrentBatch = 1;

        -- Mark the next batch
        UPDATE TOP (@BatchSize) #MyTable
        SET
            CurrentBatch = 1
        WHERE
            Deleted = 0;
    END;
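    A call along these lines (the values are only an example) can then be scheduled a few times through the day. If the database is in the Full recovery model, it is the log backup between runs that actually keeps the log file from ballooning; the database name and backup path below are placeholders:

    -- Example invocation only; batch size kept well under the escalation threshold
    EXEC [Operations].[CleanupMyTable]
          @BatchSize    = 4000
        , @DaysToRetain = 31;

    -- Placeholder database name and path; take a log backup between scheduled runs
    BACKUP LOG MyDatabase
        TO DISK = N'X:\LogBackups\MyDatabase_purge.trn';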

    Catch-all queries done right
    Gail Shaw's Performance Blog

  • You will find some good example code as well as some other considerations in the discussion HERE.

    Do NOT follow my example of a loop with DELETE TOP(5000).

    If you are deleting rows from a live table during business hours, you are better off walking through the table by a range of keys at a time. In your example, I assume [TrxId] is the primary key of the Purge_work table, so write your loop to delete rows for [TrxId] values 1 - 10000, then 10001 - 20000, and so on. You are going to be replacing or reworking the [sp_DeleteTransaction] procedure to handle multiple deletions instead of one at a time. If you are unfamiliar with the OUTPUT clause, read up on it before coding the new version.
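    A rough sketch of that approach is below. It is only a sketch: [TrxId] is char(11) in your cursor code, so rather than fixed numeric ranges it walks the key in sorted order, a batch of keys at a time, and the table it deletes from ([TransactionHeader]) is just a hypothetical stand-in for whatever [sp_DeleteTransaction] really touches.

    -- Rough sketch only, not a drop-in replacement for [sp_DeleteTransaction]
    DECLARE @BatchSize INT = 5000;
    DECLARE @LastTrxId CHAR(11) = '';

    DECLARE @Batch TABLE (TrxId CHAR(11) PRIMARY KEY);

    CREATE TABLE #DeletedTrx (TrxId CHAR(11) NOT NULL);  -- audit of what was removed

    WHILE 1 = 1
    BEGIN
        DELETE FROM @Batch;

        -- Next range of keys from Purge_work, in key order
        INSERT INTO @Batch (TrxId)
        SELECT TOP (@BatchSize) TrxId
        FROM Purge_work
        WHERE TrxId > @LastTrxId
        ORDER BY TrxId;

        IF @@ROWCOUNT = 0 BREAK;   -- no keys left to process

        SELECT @LastTrxId = MAX(TrxId) FROM @Batch;

        -- One set-based delete per range; OUTPUT records the rows removed so the
        -- same list can drive deletes against the other related tables
        DELETE TH
        OUTPUT deleted.TrxId INTO #DeletedTrx (TrxId)
        FROM TransactionHeader AS TH
        INNER JOIN @Batch AS B
            ON B.TrxId = TH.TrxId;

        WAITFOR DELAY '00:00:01';  -- breathe between ranges on a live system
    END;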

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • Why not simply TRUNCATE the table?
