Delete process

  • Hello, I am in a situation where, on a daily basis, I need to reload a table in which some of the data will remain. Between 1/3 and 1/2 of the data is purged daily, which usually amounts to several million rows. We are deleting where division_id equals a specified value, and it only takes a couple of minutes to accomplish this. The issue is that the transaction log grows to over 1.6GB regardless of whether I am in the SIMPLE or FULL recovery model during this process. I thought SIMPLE would reduce this a lot.

    Is it safe to truncate the log before and after the load process (which includes the delete) without losing the ability to do a point-in-time recovery?

    So the steps would be (roughly sketched in T-SQL below the list):

    1. truncate the log???

    2. delete from outlet_test where division_id =

    3. Bulk Insert flat file into the table

    4. truncate the log
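
    For reference, a rough T-SQL sketch of those four steps, assuming SQL Server 2005/2008, a database named SalesDB, a log file whose logical name is SalesDB_log, and made-up file paths (none of these names come from the original post). Note that truncating the log this way breaks the log backup chain, so point-in-time recovery is lost until you switch back to FULL and take a fresh full (or differential) backup:

    -- 1. "Truncate" the log: BACKUP LOG ... WITH TRUNCATE_ONLY was deprecated in SQL Server 2005
    --    and removed in 2008, so the usual approach is to switch to SIMPLE, which lets the log
    --    truncate at checkpoint. This breaks the log backup chain.
    ALTER DATABASE SalesDB SET RECOVERY SIMPLE;
    DBCC SHRINKFILE (SalesDB_log, 512);          -- optional: shrink the file to ~512 MB

    -- 2. Delete the outgoing division ('001' is only an example value)
    DELETE FROM dbo.outlet_test WHERE division_id = '001';

    -- 3. Bulk insert the flat file (path and terminators are examples)
    BULK INSERT dbo.outlet_test
    FROM 'D:\loads\outlet_test.txt'
    WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', TABLOCK);

    -- 4. Switch back to FULL and re-establish the backup chain
    ALTER DATABASE SalesDB SET RECOVERY FULL;
    BACKUP DATABASE SalesDB TO DISK = 'D:\backups\SalesDB.bak';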

    here is the table definition:

    CREATE TABLE [outlet_test](
        [retailer_id] [int] NOT NULL,
        [sku_id] [varchar](24) NOT NULL,
        [outlet_id] [varchar](20) NOT NULL,
        [division_id] AS (left([outlet_id],(3))) PERSISTED NOT NULL,
        [threshold] [int] NOT NULL DEFAULT ((0)),
        [qty] [int] NOT NULL DEFAULT ((0)),
        [sku_price] [int] NULL,
        [tax_rate] [numeric](5, 3) NULL,
        [sell_thru] [numeric](8, 3) NULL,
        CONSTRAINT [pk_outlet_test] PRIMARY KEY CLUSTERED
        (
            [sku_id] ASC,
            [outlet_id] ASC,
            [retailer_id] ASC,
            [division_id] ASC
        )
    )

    Best Regards,

    ~David

  • You probably want to use SELECT INTO, then drop the old table, and finally recreate the index/PK.
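
    A minimal sketch of that approach, assuming '001' is the division being purged and using a made-up intermediate table name (outlet_test_keep); SELECT INTO is minimally logged in the SIMPLE or BULK_LOGGED recovery model. As the next reply points out, SELECT INTO copies division_id as an ordinary column rather than a persisted computed column, so the original definition is not preserved exactly:

    -- Keep only the rows that should survive
    SELECT *
    INTO   dbo.outlet_test_keep
    FROM   dbo.outlet_test
    WHERE  division_id <> '001';       -- example value

    -- Swap the tables and put the primary key back
    DROP TABLE dbo.outlet_test;
    EXEC sp_rename 'dbo.outlet_test_keep', 'outlet_test';

    ALTER TABLE dbo.outlet_test
        ADD CONSTRAINT pk_outlet_test PRIMARY KEY CLUSTERED
            (sku_id, outlet_id, retailer_id, division_id);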


    * Noel

  • This will not work because SELECT INTO will not give me the true definition of the table, specifically division_id, which is a computed column. I suppose I could create the table, insert into it, and then create the primary key index.

    Are there any other ways you can think of?

    Thank you for your input.

    Best Regards,

    ~David

  • We got around this issue by deleting the records in slices:

    DECLARE @DeleteIncrement INT
    SET @DeleteIncrement = 10000

    WHILE 1 = 1
    BEGIN
        -- Delete the next slice; the loop exits once nothing is left to delete
        DELETE TOP (@DeleteIncrement) FROM Table WHERE where_clause

        IF @@ROWCOUNT = 0 BREAK
    END

  • Yep, always delete the rows in chunks of, say, 1,000 or 10,000 rows, and then after the deletion process is done you can set up a job to shrink the log file using DBCC SHRINKFILE.

    You can also set your recovery model to SIMPLE, as you stated before, if you don't need point-in-time recovery.
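
    A sketch of that shrink step, assuming the log file's logical name is outlet_db_log and a 512 MB target, both of which are made up; check sys.database_files for the real name:

    -- Find the logical name and current size of the log file
    SELECT name, type_desc, size FROM sys.database_files;

    -- Shrink the log file back down to roughly 512 MB (example target)
    DBCC SHRINKFILE (outlet_db_log, 512);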

  • David Kranes (2/19/2009)


    This will not work because SELECT INTO will not give me the true definition of the table, specifically division_id, which is a computed column. I suppose I could create the table, insert into it, and then create the primary key index.

    Are there any other ways you can think of?

    Thank you for your input.

    The tip about SELECT INTO is that it is a minimally logged operation, but if you try to add the computed column as an after-the-fact step you will end up scanning the table twice and generating just as much log activity.

    The idea of batching the deletes works IF you interleave it with transaction log backups or your DB is in the SIMPLE recovery model.
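
    A sketch of that interleaving, assuming FULL recovery, a database named outlet_db, an example batch size, a made-up backup path, and a log backup every 10 batches:

    DECLARE @BatchSize INT, @Batches INT
    SET @BatchSize = 100000
    SET @Batches = 0

    WHILE 1 = 1
    BEGIN
        DELETE TOP (@BatchSize) FROM dbo.outlet_test WHERE division_id = '001'   -- example value
        IF @@ROWCOUNT = 0 BREAK

        -- Back up (and thereby clear) the log every 10 batches so it cannot grow unchecked
        SET @Batches = @Batches + 1
        IF @Batches % 10 = 0
            BACKUP LOG outlet_db TO DISK = 'D:\backups\outlet_db_log.trn'        -- made-up path
    END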

    There is also the idea of using partition switching to move part of the table out of the main table and then drop the switched-out data. I have not played with it long enough, but if it delivers on the MS promise that could be your solution.
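
    For reference, a rough sketch of the switch-out side of that idea, assuming the table were partitioned on division_id and that partition 3 (an arbitrary example) held the outgoing division; the staging table name is made up, and it must match the main table's structure and sit on the same filegroup as that partition:

    -- Switching a partition out is a metadata-only operation, so it is nearly instant
    ALTER TABLE dbo.outlet_test
        SWITCH PARTITION 3 TO dbo.outlet_test_staging;

    -- Then discard the switched-out rows without heavy logging
    TRUNCATE TABLE dbo.outlet_test_staging;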


    * Noel

  • Just remember that table partitions are only available in the Enterprise Edition.

  • I am using the batch method with 100,000-row chunks and it is working very well. There is very little to no logging occurring. My question is, with this process occurring every day, could the table become fragmented? Is there any maintenance I should add to the routine to prevent this or any other potential issues? I already update statistics at the end of the data load. This is a customer database and we do not manage their backups, so I can't play with the recovery model.
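
    For what it's worth, that statistics step might look something like this (FULLSCAN is shown only as an example option):

    UPDATE STATISTICS dbo.outlet_test WITH FULLSCAN;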

    Best Regards,

    ~David

  • It can fragment, and you'd need to rebuild the clustered index to fix this. In EE you can do this online.
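
    A sketch of the online rebuild, using the clustered primary key from the table definition above (the ONLINE option requires Enterprise Edition):

    ALTER INDEX pk_outlet_test ON dbo.outlet_test
        REBUILD WITH (ONLINE = ON);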

  • Steve, what is the best way to schedule this, say weekly? Is it best to use ALTER INDEX, DBCC DBREINDEX, or some other method? Thank you.

    Best Regards,

    ~David

  • How big is the db?

    A blanket weekly schedule can work, but it can use more resources than needed. There are lots of scripts on this site that calculate fragmentation and then reindex only those items. I'd also schedule this for a low-activity time.

    DBCC is a good way to do it, but be smart about how you call DBCC and for which tables. Have an idea of how long it will take and the impact on your system, which means testing.
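
    A sketch of that targeted approach, checking sys.dm_db_index_physical_stats and then reorganizing or rebuilding only what needs it; the 5%/30% thresholds below are the usual guideline rather than a hard rule, and ALTER INDEX is the modern replacement for the deprecated DBCC DBREINDEX:

    -- Check fragmentation for the indexes on a heavily churned table
    SELECT  OBJECT_NAME(ps.object_id) AS table_name,
            i.name                    AS index_name,
            ps.avg_fragmentation_in_percent
    FROM    sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.outlet_test'), NULL, NULL, 'LIMITED') AS ps
    JOIN    sys.indexes AS i
            ON i.object_id = ps.object_id AND i.index_id = ps.index_id;

    -- Then, per index: roughly 5-30% fragmentation -> REORGANIZE, above 30% -> REBUILD
    ALTER INDEX pk_outlet_test ON dbo.outlet_test REORGANIZE;
    -- ALTER INDEX pk_outlet_test ON dbo.outlet_test REBUILD WITH (ONLINE = ON);   -- EE only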

  • Steve, there are only 2 tables that are heavily deleted from and inserted into. Those are the ones that would need any type of defragmenting on a somewhat regular basis. Thank you kindly for all your help.

    Best Regards,

    ~David

  • David Kranes (2/20/2009)


    Steve, there are only 2 tables that are heavily deleted from and inserted into. Those are the ones that would need any type of defragmenting on a somewhat regular basis. Thank you kindly for all your help.

    What is the size of each table? Fragmentation is quite normal for smaller tables, even when it is well over 65% or 70%.

    Check the fragmentation percentage after inserting. Ideally, rebuild when it is over 30%, or otherwise defrag (reorganize) them.
