delete records without using transaction logs

  • I want to delete millions of records in the development environment every month, based on a date. Is there any way to avoid using the transaction log? I know that when we delete, the operation is written to the transaction log first and then applied to the main database.

    If I could skip the transaction log, deleting millions of rows would be pretty fast.

    Any ideas will be appreciated.

  • If these rows make up your entire table, use TRUNCATE and your log will not grow.

    TRUNCATE will not work if your table is a parent table for others (i.e. it is referenced by foreign keys).

    Otherwise you may want to switch to the simple recovery model and delete in batches. Your log file(s) will not grow if you keep each transaction small enough to fit within the current log size.

    Declare @BatchSize int
    Set @BatchSize = 5000   -- Modify as needed !!!
    Set nocount on

    Declare @RowsDeleted bigint
    Declare @MyRowcount bigint
    Set @RowsDeleted = 0

    While 0 = 0
    Begin
        -- delete one batch at a time; fill in your own table and date condition
        DELETE top ( @BatchSize )
        FROM <YourTable>
        WHERE <your date condition>

        Set @MyRowcount = @@rowcount
        If @MyRowcount = 0 break

        Select @RowsDeleted = @RowsDeleted + @MyRowcount

        -- just to see it advancing .. ( % = modulo )
        If @RowsDeleted % 100000 = 0
            Print convert(varchar(26), @RowsDeleted) + ' - ' + convert(varchar(26), getdate(), 121)
    End

    Print '[' + cast(@RowsDeleted as varchar(15)) + '] Rows deleted.'

    Since size does matter, did you consider partitioning this table?

    That way you may be able to just drop a whole partition of data (sliding window).

    Check BOL.
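    For illustration, a rough sliding-window sketch; the table dbo.BigTable, the staging table dbo.BigTable_Stage and the partition function pfByMonth are hypothetical names, not anything from this thread:

        -- switch the oldest partition out to an identically structured staging table
        -- on the same filegroup; this is a metadata-only operation
        ALTER TABLE dbo.BigTable SWITCH PARTITION 1 TO dbo.BigTable_Stage;

        -- the old rows now live in the staging table and can be truncated cheaply
        TRUNCATE TABLE dbo.BigTable_Stage;

        -- finally, merge away the now-empty boundary of the partition function
        ALTER PARTITION FUNCTION pfByMonth() MERGE RANGE ('20080101');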

    Johan


  • Thanks buddy,

    I like the idea of deleting in batches; that way the transaction log files will not grow.

    I cannot use the truncate option because there is a condition attached to the delete.

    Good stuff.

  • In addition, deleting in smaller batches is faster. It also helps avoid blocking other processes, since the deletion will not hold locks on the table for too long.

    You may add a WAITFOR of a few seconds in the while loop after each batch, just to give the other processes a chance.
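    For example, a one-line sketch of that pause (the 5-second delay is just an illustrative value), placed right after the DELETE in the loop above:

        WAITFOR DELAY '00:00:05'   -- pause between batches so other sessions can get their locks in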

    Agree with ALZDBA, partitioning is the way to go on 2005.

  • I am trying to delete records in batches and it's taking forever. There are 65 million rows in one table, I am deleting based on a condition, and that table joins to some other tables.

    I came up with another idea:

    1. First copy the data we want to keep to a flat file.

    2. Truncate the table.

    3. Import the data from the flat file back into the table.

    This way is much faster.

    Any suggestions, guys?

  • This is what I have done in the past: use bcp to export the data from the table into a file, truncate the table, and then re-import the data into the table again.

    Tip: order the data when you are exporting it, in the order of the primary key, and then when you are importing it use the -h (hint) switch to declare that order. This makes life much nicer.
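    A rough command-line sketch of that approach; the database, table, key column, filter and server names are all made up here, so adjust them to your own environment:

        REM export only the rows you want to keep, ordered by the primary key
        bcp "SELECT * FROM MyDb.dbo.BigTable WHERE LoadDate >= '20080101' ORDER BY RecordID" queryout keep.dat -n -T -S MyServer

        REM ... truncate MyDb.dbo.BigTable from a query window ...

        REM import the file back, telling bcp the rows arrive already sorted on the key
        bcp MyDb.dbo.BigTable in keep.dat -n -T -S MyServer -h "TABLOCK, ORDER (RecordID ASC)"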

    But whether or not it is quicker will depend on the volume of data you are working with.

  • balbirsinghsodhi (3/25/2008)


    I am trying to delete records in batches and it's taking forever. There are 65 million rows in one table, I am deleting based on a condition, and that table joins to some other tables.

    I came up with another idea:

    1. First copy the data we want to keep to a flat file.

    2. Truncate the table.

    3. Import the data from the flat file back into the table.

    This way is much faster.

    Any suggestions, guys?

    Indeed, it depends on the "fraction" of data you want to keep.

    Maybe even a temp table may help out (sort the data according to your clustering key columns).

    And keep in mind to use an "order by [clust-key-col(s)]" when you insert the data back into the original table, and to use a bulk insert if possible.

    Doing things this way, you may avoid an extra table maintenance (reindex) step.
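    A minimal T-SQL sketch of that route, under the assumption that the table is dbo.BigTable, its clustering key is RecordID and the keep-filter is a date column (all hypothetical names):

        -- 1. park the rows to keep in tempdb
        SELECT *
        INTO #KeepRows
        FROM dbo.BigTable
        WHERE LoadDate >= '20080101';

        -- 2. empty the original table (only possible when no foreign keys reference it)
        TRUNCATE TABLE dbo.BigTable;

        -- 3. put the rows back in clustered-key order; TABLOCK allows the insert to be
        --    minimally logged under the simple or bulk-logged recovery model
        INSERT INTO dbo.BigTable WITH (TABLOCK)
        SELECT *
        FROM #KeepRows
        ORDER BY RecordID;

        DROP TABLE #KeepRows;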

    Johan


  • This suggestion assumes that the volume of data you wish to retain is reasonably small. A rough T-SQL sketch follows the list of steps.

    1. Set simple recovery mode on the host database.

    2. Run sp_rename to rename the original table.

    3. Create a new table with the original table name without indexes.

    4. INSERT INTO the empty table those rows which you wish to retain.

    5. Create your indexes on the new table.

    6. Run sp_createstats or manually create your stats.

    7. Grant privileges to match those of the original table along with any constraints.

    8. Drop the original (renamed) table, as long as you've coded a check that @@ERROR = 0.

    9. Reset to your original recovery mode.
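    A rough sketch of those steps; the database MyDb, the table dbo.BigTable, its columns, index and keep-condition are all hypothetical and only illustrate the shape of the script:

        ALTER DATABASE MyDb SET RECOVERY SIMPLE;             -- 1. simple recovery mode

        EXEC sp_rename 'dbo.BigTable', 'BigTable_Old';       -- 2. rename the original

        CREATE TABLE dbo.BigTable (                          -- 3. new table, no indexes yet
            RecordID int      NOT NULL,
            LoadDate datetime NOT NULL
            -- remaining columns ...
        );

        INSERT INTO dbo.BigTable                             -- 4. keep only what you need
        SELECT RecordID, LoadDate
        FROM dbo.BigTable_Old
        WHERE LoadDate >= '20080101';

        CREATE UNIQUE CLUSTERED INDEX PK_BigTable            -- 5. rebuild the indexes
            ON dbo.BigTable (RecordID);

        EXEC sp_createstats;                                 -- 6. create statistics

        -- 7. re-apply grants and constraints to dbo.BigTable here

        IF @@ERROR = 0                                       -- 8. drop the old copy
            DROP TABLE dbo.BigTable_Old;

        ALTER DATABASE MyDb SET RECOVERY FULL;               -- 9. back to the original mode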

  • Thanks guys.

    Here is what I have done:

    1. Create a temp table to store the data I have to keep.

    My tempdb and log are on separate drives and have lots of space, so space is not a problem.

    2. Truncate the real table.

    3. Drop all the indexes on the real table.

    4. Insert into the real table from the temp table.

    5. Recreate the indexes on the real table.

    This is faster and is working well.

    Thanks for all your help. You guys rock.

    I have noticed one thing: working with large data is no joke.

  • That's great.

    I would just like to note one thing about this solution: you can potentially lose the data you want to save if your batch fails after the table truncate and before the insert from tempdb into the truncated table has completed. The truncate deallocates pages by removing the pointers to them, so once a page has been reallocated it can't be reclaimed.

    Just a heads up.:)

  • balbirsinghsodhi (3/25/2008)


    I am trying to delete records in batches and it's taking forever. There are 65 million rows in one table, I am deleting based on a condition, and that table joins to some other tables.

    I came up with another idea:

    1. First copy the data we want to keep to a flat file.

    2. Truncate the table.

    3. Import the data from the flat file back into the table.

    This way is much faster.

    Any suggestions, guys?

    Another way may be to do a SELECT INTO to keep the data you need, then drop the original table and rename the new one... I am sure others have suggested this to you too. A quick sketch follows.
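    A quick sketch of that variant (again with made-up names; indexes, constraints and permissions still have to be recreated afterwards):

        -- copy only the wanted rows into a brand-new table
        -- (minimally logged under the simple or bulk-logged recovery model)
        SELECT *
        INTO dbo.BigTable_New
        FROM dbo.BigTable
        WHERE LoadDate >= '20080101';

        DROP TABLE dbo.BigTable;                          -- drop the original

        EXEC sp_rename 'dbo.BigTable_New', 'BigTable';    -- take over the original name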
