Deleting Large Number of Records
Posted Tuesday, September 15, 2009 8:24 AM
I use this (it's home-cooked):

create proc temp_slowDelete
(
    @table varchar(255),
    @where varchar(1024)
)
as

/* Slow, transaction-friendly delete: removes matching rows in batches of 250,000 */
declare @ParmDefinition nvarchar(1024)
declare @deleteStatement nvarchar(2048)
declare @statement nvarchar(2048)
declare @total int
declare @rowcount int

select @deleteStatement = N'delete from ' + @table + N' where ' + @where

set rowcount 250000  -- cap each delete at 250,000 rows
set nocount on

-- Count how many rows match the WHERE clause
RAISERROR ('Counting rows to delete...', 0, 1) WITH NOWAIT
SET @ParmDefinition = N'@CountOUT int OUTPUT'
select @statement = N'select @CountOUT = count(*) from ' + @table + N' where ' + @where
exec sp_executesql @statement, @ParmDefinition, @CountOUT = @total OUTPUT
if (@total = 0)
begin
    print 'Nothing to do :)'
    return
end

RAISERROR ('%d to delete', 0, 1, @total) WITH NOWAIT

-- Delete in batches until no matching rows remain
exec sp_executesql @deleteStatement
set @rowcount = @@ROWCOUNT
while (@rowcount > 0)
begin
    select @total = @total - @rowcount
    RAISERROR ('Deleted %d, %d left', 0, 1, @rowcount, @total) WITH NOWAIT
    exec sp_executesql @deleteStatement
    set @rowcount = @@ROWCOUNT
end
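
A hypothetical call, assuming a table named myTable with a date column colDate (both names are illustrative only):

-- Hypothetical usage; table and predicate are examples, not from the thread
exec temp_slowDelete
    @table = 'myTable',
    @where = 'colDate between ''20090101'' and ''20091023'''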
Post #788190
Posted Tuesday, September 15, 2009 8:27 AM
Ric Sierra (9/15/2009)
Just... keep it simple!

If you need to delete a large number of rows, and the scenario doesn't require keeping the database log:

In this case I want to delete all the rows between 2009/01/01 and 2009/10/23, about 100,000,000 rows.

delete top (10000) from myTable where colDate between '20090101' and '20091023'
backup log myDatabase with no_log
go 10000


What do you think?


P.S. The recovery model is SIMPLE.


If your database is using the SIMPLE recovery model you can't do BACKUP LOG, nor is it needed. The only other issue I'd have with your simple process is that it runs more often than needed: with GO 10000, your delete statement is going to run 10,000 times regardless of how many times it really needs to run.

If your database is using the BULK_LOGGED or FULL recovery model, then you do need the transaction log backups in case a recovery is needed. The code I have provided can be incorporated into a stored procedure where periodic deletes of large numbers of records need to be accomplished.
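
As a rough sketch of that idea under FULL or BULK_LOGGED recovery (the table, column, and backup path are hypothetical; this is not the actual code referenced above):

-- Hypothetical batched delete; the loop stops as soon as no rows remain,
-- so it never runs more times than needed
declare @rows int
set @rows = 1

while @rows > 0
begin
    delete top (10000) from myTable
    where colDate between '20090101' and '20091023'

    set @rows = @@ROWCOUNT

    -- back up the log between batches so its space can be reused
    -- (only meaningful under FULL or BULK_LOGGED recovery)
    if @rows > 0
        backup log myDatabase to disk = 'D:\Backup\myDatabase_log.trn'
end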



Lynn Pettis

Post #788192
Posted Tuesday, September 15, 2009 8:32 AM
Keep in mind that even in SIMPLE mode the log can grow. If you don't break the operation into batches, the entire delete is logged as a single transaction, which can be a lot of data. Only once a transaction has committed can its log records be marked as reusable during a checkpoint.
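
A minimal sketch of that batching under SIMPLE recovery (table and column names are made up): each DELETE commits as its own transaction, so a checkpoint between batches lets the committed log records be reused instead of growing the log:

while 1 = 1
begin
    delete top (10000) from myTable
    where colDate between '20090101' and '20091023'

    if @@ROWCOUNT = 0 break

    checkpoint  -- under SIMPLE recovery, allows committed log space to be reused
end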







Follow me on Twitter: @way0utwest

Post #788200
Posted Tuesday, September 15, 2009 8:42 AM
Thanks, Lynn. Most timely, since I'm in the middle of deleting something like 15 million rows from a production database that is designated high availability in our organization. Very helpful!
Post #788214
Posted Tuesday, September 15, 2009 8:47 AM
Lynn Pettis (9/15/2009)
[...] Using the GO 10000, your delete statement is going to run 10,000 times regardless of how many times it really needs to run.

In this scenario I need to delete 100,000,000 rows. Each batch deletes 10,000 rows, so to be sure all the rows get deleted I need to run the batch 10,000 times:
10,000 x 10,000 = 100,000,000
Post #788220
Posted Tuesday, September 15, 2009 9:26 AM
Ric Sierra (9/15/2009)
[...] In this scenario I need to delete 100,000,000 rows. Each batch deletes 10,000 rows, so to be sure all the rows get deleted I need to run the batch 10,000 times: 10,000 x 10,000 = 100,000,000.


That was not obvious from your original post; there was nothing there to indicate how many rows were to be deleted. Please remember, my code makes no determination regarding the total number of rows to be deleted. The delete in the example code is based on a date range, and if you look at the number of records generated in the sample dataset, there is no specific number of records for each year; it is randomly generated, which is much more likely in a real-world environment. You aren't always going to be deleting a set number of rows each time.



Lynn Pettis

Post #788270
Posted Tuesday, September 15, 2009 9:30 AM
Lynn Pettis (9/15/2009)
That was not obvious based on your original post. There was nothing there to indicate how many rows were to be deleted.

He did say there was a set number of rows, though...

Aside from that, I thoroughly agree: there's not much point in counting the rows first; you might as well just get on with it...


Atlantis Interactive - SQL Server Tools
Post #788274
Posted Tuesday, September 15, 2009 9:42 AM
Lynn Pettis (9/15/2009)
[...] My code makes no determination regarding the total number of rows to be deleted. You aren't always going to be deleting a set number of rows each time.


I agree with your code; my point of view is just an alternative (a fast track) for doing something similar.
Post #788284
Posted Tuesday, September 15, 2009 9:50 AM
I just had to deal with this. However, I needed an hourly sliding window that deletes records older than 3 hours, used for .NET viewstate.

I had the database in SIMPLE recovery mode and was deleting in small batches. However, the deletes were not keeping up with the inserts, and since it is viewstate, the records are very big.

What I did was set up a partitioned view and just truncate the oldest table in the view, thereby deleting 200,000 records in less than a second without affecting the inserts.

IMHO, truncating partitions, or tables in a partitioned view, is much cleaner and faster than this method.
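
A rough sketch of that sliding-window idea (all object names are hypothetical): one base table per hour behind a view, with the oldest table truncated instead of deleted from. A real partitioned view would also need CHECK constraints on the partitioning column:

-- Hypothetical hourly tables unioned behind a view
create view dbo.ViewStateAll
as
    select * from dbo.ViewState_Hour00
    union all
    select * from dbo.ViewState_Hour01
    union all
    select * from dbo.ViewState_Hour02
go

-- Aging out the oldest hour is a truncate, not a delete:
-- minimally logged and near-instant, regardless of row count
truncate table dbo.ViewState_Hour00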
Post #788298
Posted Tuesday, September 15, 2009 10:17 AM
robblackin (9/15/2009)
[...] What I did was set up a partitioned view and just truncate the oldest table in the view, thereby deleting 200,000 records in less than a second without affecting the inserts. IMHO, truncating partitions, or tables in a partitioned view, is much cleaner and faster than this method.


If you are in a position to use partitioned tables in SQL Server 2005/2008, then yes, I agree. Unfortunately, this may not be possible for several reasons. One, you may be using SQL Server 2005/2008 Standard Edition, where partitioned tables aren't available. Or your data may not be set up using partitioned views (and therefore separate tables). If that is the case, then your option isn't available. I am just providing an option. As with all things in the SQL Server world, it depends. This is just one option available to people in this position.
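
For comparison, a hedged sketch of the partitioned-table (Enterprise Edition) equivalent; all names are hypothetical. Switching a partition out to an empty staging table of identical structure is a metadata-only operation:

-- Move the oldest partition into an empty staging table, then truncate it.
-- SWITCH moves no data, so it completes almost instantly.
alter table dbo.ViewState switch partition 1 to dbo.ViewState_Stage
truncate table dbo.ViewState_Stage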



Lynn Pettis

Post #788340