August 29, 2007 at 4:23 pm
Hi All,
I have very wide 42 million row table that I need to start cleaning up.
I can copy the whole table off to another system for archival purposes, but am now trying to figure out how to write a procedure that will delete say a thousand rows at a time and then commit the transaction so as to not lock up our production system for an eternity. 
I wan to schedule and let it run for days upon days... 
Help! I'm a crappy DBA! 
[font="Verdana"]-- Wally
-- Reluctant DBA[/font]
August 29, 2007 at 6:55 pm
I'm not sure I understand your situation. You already have copied out all the data and now want to empty the table but leave it there?
August 29, 2007 at 8:53 pm
The table currently has 5 years of data in it. We want it to only have 60 days of history moving forward.
So I need to delete about 40 million rows, a few thousand at a time with a commit so I don't lock up the production system for hours and hours...
[font="Verdana"]-- Wally
-- Reluctant DBA[/font]
August 30, 2007 at 7:41 am
There are a few options that I can think of.
1. Create a copy of the empty table, copy in the last 60 days data, rename the tables so that the new one is the live and the old one is the archive, delete the last 60 days data from the archive. This solves the initial archive problem but doesn't do anything for the ongoing archive.
2. Create archive table. In a procedure, populate a temp table (or table variable) with a distinct list of dates older than 60 days. Loop through this table one day at a time and archive/delete the rows. Add a pause in the processing to allow other uses a chance to access the table.
3. Similar to 2. Create achive table and insert archive data. Set rowcount to 10,000 (or other batch size) and loop while the number of rows deleted is not zero:
declare @continue char(1)
set rowcount 10000
set @continue = 'Y'
while @continue = 'Y' begin
-- delete statement
if @@rowcount = 0 set @continue = 'N'
end
HTH
Jez
August 30, 2007 at 8:52 am
If the table has an identity column that corresponds roughly to the age of the data, find the minimum ID, and the lowest ID you want to keep. Then use a while loop to delete a range of 10,000 rows at a time, beginning with the oldest ID. You should re-index the table when you are done.
Declare @StartID int, @EndID int, @LastID int
Select @StartID = min(ID) from MyTable
Select @LastID = min(ID) from MyTable where MyDate >= dateadd(dd,datediff(dd,0,getdate()-60),0)
While 1=1
Begin
Set @EndID = case when @StartID+9999 > @LastID-1 then @LastID-1 else @StartID+9999 end
Delete from MyTable where ID between @StartID and @EndID
Set @StartID = @StartID+10000
If @StartID >= @EndID break
End
August 30, 2007 at 9:32 am
Thanks gang, you've got me off to a great start!
[font="Verdana"]-- Wally
-- Reluctant DBA[/font]
August 30, 2007 at 1:31 pm
Also, if you can pull it off - drop any and all indexes you can during the delete process, then recreate them when you're done. things will run quite a bit faster if you can afford it.
(not including the clustered index - DON't mess with that).
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
August 30, 2007 at 3:20 pm
Yup, probably can't do that though... There are only a few on the table and they are highly utilized... 
Thanks All!
[font="Verdana"]-- Wally
-- Reluctant DBA[/font]
Viewing 8 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply