SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Deleting batches of rows with TOP


Deleting batches of rows with TOP

Author
Message
david.gerrard-604066
david.gerrard-604066
Valued Member
Valued Member (69 reputation)Valued Member (69 reputation)Valued Member (69 reputation)Valued Member (69 reputation)Valued Member (69 reputation)Valued Member (69 reputation)Valued Member (69 reputation)Valued Member (69 reputation)

Group: General Forum Members
Points: 69 Visits: 63
I thought the same... "Why would I ever need to delete 5,000,000 rows from a huge database table?" But then I thought "you never know!" It's not the sort of thing you would do every day (I hope) but it's more likely you'd need to do it in response to some disaster or other. And knowing a way to do so effectively without making the situation worse might save your skin one day.
Ben Thul
Ben Thul
SSC-Enthusiastic
SSC-Enthusiastic (172 reputation)SSC-Enthusiastic (172 reputation)SSC-Enthusiastic (172 reputation)SSC-Enthusiastic (172 reputation)SSC-Enthusiastic (172 reputation)SSC-Enthusiastic (172 reputation)SSC-Enthusiastic (172 reputation)SSC-Enthusiastic (172 reputation)

Group: General Forum Members
Points: 172 Visits: 469
I like the following a bit better as you don't have to find out how many rows you're going to affect before affecting them. YMMV.
declare @rowcount int
set @rowcount = 1

declare @batchsize int
set @batchsize = 5000

while(@rowcount > 0)
begin
delete top @batchsize from table where foo=1
set @rowcount = @@rowcount
end


cjeremy74
cjeremy74
Old Hand
Old Hand (361 reputation)Old Hand (361 reputation)Old Hand (361 reputation)Old Hand (361 reputation)Old Hand (361 reputation)Old Hand (361 reputation)Old Hand (361 reputation)Old Hand (361 reputation)

Group: General Forum Members
Points: 361 Visits: 346
Anirban Paul (7/3/2008)
Is these one practically useful? I have a doubt. Any comments on these?


I actually did use this about a month ago... we had a new policy implemented that allowed us to delete a couple of years worth of data out of several audit tables. The largest delete was approx 2.5 million rows. Agreed probably not an every day use, but does get us less experienced DBA's thinking beyond getting something to just work.
w.durkin@online.de
w.durkin@online.de
Ten Centuries
Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)

Group: General Forum Members
Points: 1069 Visits: 1879
On top of these one off deletions there will undoubtedly be cases when such a delete needs to be done. Off the top of my head, I would say a logging table may well need this type of delete to be done.

The original programmer of the TOP option in t-sql had a reason back then, that was a good enough reason to allow this behaviour Tongue

EDIT: I find the TOP option good when doing updates on production tables which can otherwise cause blocking. It's not always nice, but this is real life and not a walt disney film !

Regards,

WilliamD
Christoph D
Christoph D
SSC Veteran
SSC Veteran (299 reputation)SSC Veteran (299 reputation)SSC Veteran (299 reputation)SSC Veteran (299 reputation)SSC Veteran (299 reputation)SSC Veteran (299 reputation)SSC Veteran (299 reputation)SSC Veteran (299 reputation)

Group: General Forum Members
Points: 299 Visits: 357
cjeremy74 (7/3/2008)
I actually did use this about a month ago... we had a new policy implemented that allowed us to delete a couple of years worth of data out of several audit tables. The largest delete was approx 2.5 million rows. Agreed probably not an every day use, but does get us less experienced DBA's thinking beyond getting something to just work.


Same with me, we do have a logging table where all the errors are stored. If there is a small bug in the system, there can be aprox. 3000 rows per second. So if you fix the bug in a few minutes there are still tons of rows in the table and there is an insert for sure if you want to delete the rows all at once.

So it is useful but as I wrote a few rows earlier, I do prefer rowcount but it doesn't matter how you solve a problem as long as you can solve it with as less work for the server as possible.
Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (210K reputation)SSC Guru (210K reputation)SSC Guru (210K reputation)SSC Guru (210K reputation)SSC Guru (210K reputation)SSC Guru (210K reputation)SSC Guru (210K reputation)SSC Guru (210K reputation)

Group: General Forum Members
Points: 210289 Visits: 41973
GermanDBA (7/3/2008)
Also, would this not be even better using a numbers table á la Jeff Moden http://www.sqlservercentral.com/articles/TSQL/62867/ ?


Thank you for the reference... but the Tally table isn't a panacea for all looping problems. The loop here is a process control loop that controls how many times a DELETE is used... I don't believe a Tally table could be used here. If someone can think of a way, please post it because THAT would be truly useful. Smile Heck, if you can think of a way, write an article on it! BigGrin

The reasons why someone would want to do a delete like this has been totally missed...

The purpose of doing looped Deletes is usually two fold...
1. Do the deletes and still allow the table to be used.
2. Keep the transaction log small.

Item #2 above can only be accomplished if the DB is in the simple recovery mode or there is some additional code in the loop that truncates the log while it is running.

The big thing is Item #1... how to do the (massive) deletes and still allow the table to be used by others. For that, you have to have a delay in the loop and you can do that using WAITFOR DELAY '00:00:10' where the number 10 is some number of seconds. The loop that's listed in the article (and in the thread), whether ROWCOUNT or TOP is used, is fairly agressive and won't necessarily allow other processes to "get in". It's so agressive that you might as well just do the delete all at once. In order for the loop to actually allow other processes to work, the TOP or ROWCOUNT should be small enough to limit any locking to about a second and then the delay should be long enough to allow most other processes to get in and get out... I would think that something between 3 and 10 seconds would be good enough.

Shifting gears a bit... I'm really torn... I very much encourage folks to write articles on this forum... the simple sharing of ideas is what the forum is all about... and, I very much encourage folks that read them to remember that not everyone who can write code can also write a masterpiece article. But, I do feel that there were a large number of points missed in this article and, perhaps, some things in the code that violate what some would consider to be a best practice (you don't need to count loops to do this).

It was pointed out that "For larger tables, where this technique is most effective, you would obviously use larger values for @N and probably constrain on a date column." What a prime opportunity for several explanations... Why is this technique most effective for larger tables? Why would you probably constrain ona a date column? Why would you want to use larger values for @N?

So, like I said, I'm really torn... I really want people to be encouraged to write without fear of being bashed, but I do have to agree with what some have already said... a bit more thought, a bit more research, and a bit more explanation of the "Why" of "implied points made" would have gone far into turning what should have been a casual script submission into a well written and informative article.

Heh... of course, sometimes it's not actually the article that counts... sometimes the discussions that follow are more useful than the original intent of the article. In that case, this article has more than done it's job. Wink

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Ted Manasa
Ted Manasa
SSC-Addicted
SSC-Addicted (411 reputation)SSC-Addicted (411 reputation)SSC-Addicted (411 reputation)SSC-Addicted (411 reputation)SSC-Addicted (411 reputation)SSC-Addicted (411 reputation)SSC-Addicted (411 reputation)SSC-Addicted (411 reputation)

Group: General Forum Members
Points: 411 Visits: 299
Thanks for all of your responses.

This method has been helpful to us for purging log tables and history in our data warehouse. I realize that partitioned tables would have been a better way to handle history purges, but we were not able to implement partitioned tables for various reasons. (Maybe Phase II...)

As for why @cnt had to be DECIMAL: Say you had 11 rows to delete, and you set a TOP value of 5. Obviously, you would need 3 iterations of the DELETE to capture the last (1) record. However, when figuring out how many iterations you need, 11/5 = 2.2, which is a DECIMAL (which you use CEILING() on to get the next largest INT). AFAIK, in order to get a decimal result (2.2), the dividend itself must be of type DECIMAL.

---------------------------
|Ted Pin >>
Ted Manasa
Ted Manasa
SSC-Addicted
SSC-Addicted (411 reputation)SSC-Addicted (411 reputation)SSC-Addicted (411 reputation)SSC-Addicted (411 reputation)SSC-Addicted (411 reputation)SSC-Addicted (411 reputation)SSC-Addicted (411 reputation)SSC-Addicted (411 reputation)

Group: General Forum Members
Points: 411 Visits: 299
@Jeff:

Thanks for your feedback. I will try to create a more well thought-out article for my next submission based on the comments in this discussion. I looked around for an article that guides article writing, but couldn't find one. Would a prolific contributor like to write for us? Hehe

---------------------------
|Ted Pin >>
david.gerrard-604066
david.gerrard-604066
Valued Member
Valued Member (69 reputation)Valued Member (69 reputation)Valued Member (69 reputation)Valued Member (69 reputation)Valued Member (69 reputation)Valued Member (69 reputation)Valued Member (69 reputation)Valued Member (69 reputation)

Group: General Forum Members
Points: 69 Visits: 63
Hey- I've learned a whole lot from the discussion - more than I would have done if I'd had to skim read a much longer article.

I'm clearly a much less experienced database person that all you lot (I'm a .Net analyst / developer that seems to have to do more and more database work with every passing day). So short, basic articles that gloss over some of the finer details may not work quite so well for you guys (who already know the basics and more) but they're a big help for me.

(I guess DBAs don't really do "glossing over details" though, eh? That's how millions of rows get deleted by accident! Wink)

I've had a quick look but couldn't see if the articles have a 'level' or 'suitable for...' category attached to them. Maybe that would help?
Ted Manasa
Ted Manasa
SSC-Addicted
SSC-Addicted (411 reputation)SSC-Addicted (411 reputation)SSC-Addicted (411 reputation)SSC-Addicted (411 reputation)SSC-Addicted (411 reputation)SSC-Addicted (411 reputation)SSC-Addicted (411 reputation)SSC-Addicted (411 reputation)

Group: General Forum Members
Points: 411 Visits: 299
david.gerrard (7/3/2008)
Hey- I've learned a whole lot from the discussion - more than I would have done if I'd had to skim read a much longer article.


David, thank you for being the sole clapper in a tough crowd Wink I'm glad this little piece was able to help someone (which was the most I had hoped for). You'll help me get back up on stage for more punishment =D

---------------------------
|Ted Pin >>
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search