Deleting Large Number of Records

Author
Message
STEVE-449765
Forum Newbie (1 reputation)

Group: General Forum Members
Points: 1 Visits: 12
I use this (it's home-cooked):

create proc temp_slowDelete
(
    @table varchar(255),
    @where varchar(1024)
)
as

/* Slow, transaction-friendly delete: removes rows in batches of 250,000 */
declare @ParmDefinition nvarchar(1024)
declare @deleteStatement nvarchar(2048)
declare @statement nvarchar(2048)
declare @total int
declare @rowcount int

select @deleteStatement = 'delete from ' + @table + ' where ' + @where

set rowcount 250000   -- cap each delete at 250,000 rows
set nocount on

RAISERROR ('Counting rows to delete...', 0, 1) WITH NOWAIT
SET @ParmDefinition = N'@CountOUT int OUTPUT';
select @statement = 'select @CountOUT = count(*) from ' + @table + ' where ' + @where
exec sp_executesql @statement, @ParmDefinition, @CountOUT = @total OUTPUT;
if (@total = 0)
begin
    print 'Nothing to do'
    return
end

RAISERROR ('%d to delete', 0, 1, @total) WITH NOWAIT

exec sp_executesql @deleteStatement
set @rowcount = @@ROWCOUNT
while (@rowcount > 0)
begin
    select @total = @total - @rowcount
    RAISERROR ('Deleted %d, %d left', 0, 1, @rowcount, @total) WITH NOWAIT
    exec sp_executesql @deleteStatement
    set @rowcount = @@ROWCOUNT
end
Lynn Pettis
One Orange Chip (26K reputation)

Group: General Forum Members
Points: 26450 Visits: 38121
Ric Sierra (9/15/2009)
Just... keep it simple!

If you need to delete a large number of rows, and the scenario doesn't require keeping the database log:

In this case I want to delete all the rows between 2009/01/01 and 2009/10/23, about 100,000,000 rows.

delete top (10000) from myTable where colDate between '20090101' and '20091023'
backup log myDatabase with no_log
go 10000


What do you think?


P.S. The recovery model is SIMPLE.


If your database is using the SIMPLE recovery model you can't do BACKUP LOG, nor is it needed. The only other issue I'd have with your simple process is why run it more times than needed: using GO 10000, your delete statement is going to run 10,000 times regardless of how many runs it actually requires.

If your database is using the BULK-LOGGED or FULL recovery model, then you do need the transaction log backups in case a recovery is needed. The code I have provided can be incorporated into a stored procedure where periodic deletes of large numbers of records need to be accomplished.
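A loop that checks @@ROWCOUNT addresses that objection: it stops as soon as the date range is empty instead of running a fixed number of times. A minimal sketch, reusing the myTable/colDate example from this thread (the batch size of 10,000 is illustrative):

```sql
-- Batched delete that ends as soon as no rows remain, instead of a fixed GO count.
-- Assumes the myTable/colDate example from this thread; 10,000 is illustrative.
declare @rows int;
set @rows = 1;

while (@rows > 0)
begin
    delete top (10000)
    from myTable
    where colDate between '20090101' and '20091023';

    set @rows = @@ROWCOUNT;  -- 0 once the date range is empty, ending the loop
end
```

Under SIMPLE recovery nothing more is needed; under FULL or BULK-LOGGED, a BACKUP LOG inside the loop keeps the log from growing between batches.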

Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here (or here when working with partitioned tables)
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
Steve Jones
SSC-Forever (40K reputation)

Group: Administrators
Points: 40392 Visits: 18846
Keep in mind that even with the SIMPLE recovery model, the log can grow. If you don't break the operation into batches, the entire delete is logged as a single transaction, which can be a lot of data. Only after a transaction commits can its log records be marked as reusable during a checkpoint.
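That commit boundary can be made explicit by running each batch in its own short transaction, so the log never holds more than one batch's worth of uncommitted work. A sketch, with a hypothetical table and an invented 3-hour predicate:

```sql
-- Each batch commits separately, so its log records become reusable at the
-- next checkpoint (SIMPLE) or log backup (FULL). Names are hypothetical.
declare @done bit;
set @done = 0;

while (@done = 0)
begin
    begin transaction;

    delete top (10000)
    from dbo.SessionState
    where created < dateadd(hour, -3, getdate());

    if (@@ROWCOUNT = 0)
        set @done = 1;      -- nothing left inside the window

    commit transaction;     -- commit boundary: log space becomes reclaimable
end
```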

Follow me on Twitter: @way0utwest
Forum Etiquette: How to post data/code on a forum to get the best help
My Blog: www.voiceofthedba.com
lfaught
Grasshopper (12 reputation)

Group: General Forum Members
Points: 12 Visits: 71
Thanks, Lynn. Most timely, since I'm in the middle of deleting something like 15 million rows from a production database that is designated high availability in our organization. Very helpful!
Ric Sierra
UDP Broadcaster (1.4K reputation)

Group: General Forum Members
Points: 1441 Visits: 342
Lynn Pettis (9/15/2009)
Ric Sierra (9/15/2009)
Just... keep it simple!

If you need to delete a large number of rows, and the scenario doesn't require keeping the database log:

In this case I want to delete all the rows between 2009/01/01 and 2009/10/23, about 100,000,000 rows.

delete top (10000) from myTable where colDate between '20090101' and '20091023'
backup log myDatabase with no_log
go 10000


What do you think?


P.S. The recovery model is SIMPLE.


If your database is using the SIMPLE recovery model you can't do BACKUP LOG, nor is it needed. The only other issue I'd have with your simple process is why run it more times than needed: using GO 10000, your delete statement is going to run 10,000 times regardless of how many runs it actually requires.

If your database is using the BULK-LOGGED or FULL recovery model, then you do need the transaction log backups in case a recovery is needed. The code I have provided can be incorporated into a stored procedure where periodic deletes of large numbers of records need to be accomplished.


In this scenario I need to delete 100,000,000 rows; each batch deletes 10,000 rows, so to be sure all the rows are deleted I need to run the batch 10,000 times.
10,000 x 10,000 = 100,000,000
Lynn Pettis
One Orange Chip (26K reputation)

Group: General Forum Members
Points: 26450 Visits: 38121
Ric Sierra (9/15/2009)
Lynn Pettis (9/15/2009)
Ric Sierra (9/15/2009)
Just... keep it simple!

If you need to delete a large number of rows, and the scenario doesn't require keeping the database log:

In this case I want to delete all the rows between 2009/01/01 and 2009/10/23, about 100,000,000 rows.

delete top (10000) from myTable where colDate between '20090101' and '20091023'
backup log myDatabase with no_log
go 10000


What do you think?


P.S. The recovery model is SIMPLE.


If your database is using the SIMPLE recovery model you can't do BACKUP LOG, nor is it needed. The only other issue I'd have with your simple process is why run it more times than needed: using GO 10000, your delete statement is going to run 10,000 times regardless of how many runs it actually requires.

If your database is using the BULK-LOGGED or FULL recovery model, then you do need the transaction log backups in case a recovery is needed. The code I have provided can be incorporated into a stored procedure where periodic deletes of large numbers of records need to be accomplished.


In this scenario I need to delete 100,000,000 rows; each batch deletes 10,000 rows, so to be sure all the rows are deleted I need to run the batch 10,000 times.
10,000 x 10,000 = 100,000,000


That was not obvious from your original post; there was nothing there to indicate how many rows were to be deleted. Please remember, my code makes no determination regarding the total number of rows to be deleted. The delete in the example code is based on a date range, and if you look at the number of records generated in the sample dataset, there is no specific number of records for each year; it is randomly generated, which is much more likely in a real-world environment. You aren't always going to be deleting a set number of rows each time.

Lynn Pettis
Matt Whitfield
Mr or Mrs. 500 (527 reputation)

Group: General Forum Members
Points: 527 Visits: 719
Lynn Pettis (9/15/2009)
That was not obvious based on your original post. There was nothing there to indicate how many rows were to be deleted.

He did say there was a set number of known rows... :-)

Aside from that though, thoroughly agree - there's not even that much point in counting the rows first - you might as well just get on with it...

Atlantis Interactive - SQL Server Tools
My blog
Why I wrote a sql query analyzer clone
Ric Sierra
UDP Broadcaster (1.4K reputation)

Group: General Forum Members
Points: 1441 Visits: 342
Lynn Pettis (9/15/2009)
Ric Sierra (9/15/2009)
Lynn Pettis (9/15/2009)
Ric Sierra (9/15/2009)
Just... keep it simple!

If you need to delete a large number of rows, and the scenario doesn't require keeping the database log:

In this case I want to delete all the rows between 2009/01/01 and 2009/10/23, about 100,000,000 rows.

delete top (10000) from myTable where colDate between '20090101' and '20091023'
backup log myDatabase with no_log
go 10000


What do you think?


P.S. The recovery model is SIMPLE.


If your database is using the SIMPLE recovery model you can't do BACKUP LOG, nor is it needed. The only other issue I'd have with your simple process is why run it more times than needed: using GO 10000, your delete statement is going to run 10,000 times regardless of how many runs it actually requires.

If your database is using the BULK-LOGGED or FULL recovery model, then you do need the transaction log backups in case a recovery is needed. The code I have provided can be incorporated into a stored procedure where periodic deletes of large numbers of records need to be accomplished.


In this scenario I need to delete 100,000,000 rows; each batch deletes 10,000 rows, so to be sure all the rows are deleted I need to run the batch 10,000 times.
10,000 x 10,000 = 100,000,000


That was not obvious from your original post; there was nothing there to indicate how many rows were to be deleted. Please remember, my code makes no determination regarding the total number of rows to be deleted. The delete in the example code is based on a date range, and if you look at the number of records generated in the sample dataset, there is no specific number of records for each year; it is randomly generated, which is much more likely in a real-world environment. You aren't always going to be deleting a set number of rows each time.


I agree with your code; my point of view is only an alternative (a fast track) for doing something similar.
robblackin
Forum Newbie (1 reputation)

Group: General Forum Members
Points: 1 Visits: 27
I just had to deal with this. However, I need an hourly sliding window where I delete records older than 3 hours, for working with .NET viewstate.

I had the database in the SIMPLE recovery model and was deleting in small batches. However, the deletes were not keeping up with the inserts. Since it is viewstate, the records are very big.

What I did was set up a partitioned view and just truncate the table in the view, thereby deleting 200,000 records in less than a second without affecting the inserts.

IMHO, truncating partitions, or tables in a partitioned view, is much cleaner and faster than this method.
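The sliding-window approach described above can be sketched roughly like this (all names are hypothetical): one member table per hour slot, a CHECK constraint on each so the view's partitioning is well defined, and an hourly truncate of whichever table has aged out of the 3-hour window:

```sql
-- Hypothetical sketch of a partitioned-view sliding window.
-- One member table per hour slot; the CHECK constraint lets the
-- optimizer prune and keeps the view's partitioning well defined.
create table dbo.ViewState_Slot0 (
    session_id uniqueidentifier not null,
    slot tinyint not null check (slot = 0),
    created datetime not null,
    state varbinary(max) not null,
    constraint PK_ViewState_Slot0 primary key (session_id, slot)
);
-- ... dbo.ViewState_Slot1 through dbo.ViewState_Slot3 defined the same
-- way, with slot = 1..3 ...

create view dbo.ViewState
as
    select session_id, slot, created, state from dbo.ViewState_Slot0
    union all
    select session_id, slot, created, state from dbo.ViewState_Slot1
    -- ... union all the remaining slot tables ...
;

-- Hourly job: empty the slot that just aged out of the 3-hour window.
-- TRUNCATE TABLE deallocates pages rather than logging each row, so it
-- completes quickly regardless of row count.
truncate table dbo.ViewState_Slot3;
```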
Lynn Pettis
One Orange Chip (26K reputation)

Group: General Forum Members
Points: 26450 Visits: 38121
robblackin (9/15/2009)
I just had to deal with this. However, I need an hourly sliding window where I delete records older than 3 hours, for working with .NET viewstate.

I had the database in the SIMPLE recovery model and was deleting in small batches. However, the deletes were not keeping up with the inserts. Since it is viewstate, the records are very big.

What I did was set up a partitioned view and just truncate the table in the view, thereby deleting 200,000 records in less than a second without affecting the inserts.

IMHO, truncating partitions, or tables in a partitioned view, is much cleaner and faster than this method.


If you are in a position to use partitioned tables in SQL Server 2005/2008, then yes, I agree. Unfortunately, this may not be possible for several reasons. One, you may be using SQL Server 2005/2008 Standard Edition, where partitioned tables aren't available. Or your data may not be set up using partitioned views (and therefore separate tables). If that is the case, then your option isn't available. I am just providing an option. As with all things in the SQL Server world, it depends. This is just one option available to people in this position.
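Where Enterprise Edition partitioned tables are available, the equivalent of the truncate trick is switching an aged partition out to an empty staging table and truncating that. A sketch with hypothetical names:

```sql
-- Enterprise-only sketch (SQL Server 2005/2008 partitioned tables).
-- The staging table must be empty and match the source table's schema,
-- indexes, and filegroup; names here are hypothetical.
alter table dbo.myTable
    switch partition 1 to dbo.myTable_staging;  -- metadata-only, near-instant

truncate table dbo.myTable_staging;             -- minimally logged cleanup
```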

Lynn Pettis