March 15, 2004 at 11:50 am
Hi everyone, my question is, if I have an update statement that updates 30+ million rows, is there a way to make this update commit every few thousand rows that it updates? Thanks in advance.
March 15, 2004 at 12:04 pm
Put your update statement in a loop, update only those that fit the criteria, increment your criteria, loop, and then update again.
mom
March 15, 2004 at 2:46 pm
I think that using a loop (especially to iterate through a cursor) might be a good idea here, so I'd agree with mom ... BUT ... you don't want to commit every loop iteration if you're retrieving one row per iteration. That would kill your performance (I could go into painful detail here, but it works better on a whiteboard). I'm pretty sure that mom means the technique where your where clause allows the update of a bunch of rows every loop iteration, so that each commit also handles a bunch of rows. But I thought it would be worth mentioning explicitly, because (ouch!) issuing a commit on each row update can just destroy your throughput.
Cheers,
Chris
March 15, 2004 at 5:53 pm
The technical I often use to avoid log grow too large is using set rowcount. It control exactly how many rows to update.
March 15, 2004 at 9:25 pm
That's a good technique, but it's easiest to use set rowcount on deletes (because you don't have to adjust the SQL statement - just run it repeatedly in a loop until you run out of rows to delete). Remember, Shahgol needs to do updates, and so will need something a little more sophisticated than a "vanilla" set rowcount loop. I assume this is what mom was referring to also - in this case the where clause of the update will probably have to be dynamically modified with each iteration to produce a windowed effect so that every row is updated at least once. Either that, or updating through a cursor, counting the number of iterations, and doing a commit / begin transaction every Nth iteration, with one more cleanup commit at the end.
The "windowed where clause" approach can be more efficient - sometimes - but the "cursor with commits every Nth row" approach is general and can always be used (sometimes the "windowed where clause" can be too complex for mere mortals to implement when the where clause has to be more complex than just identifying a range of integer primary key values).
Cheers,
Chris
March 16, 2004 at 7:43 am
Maybe you can use the Update in combination with an SELECT top 1000
UPDATE
March 16, 2004 at 6:21 pm
The hard thing about this problem is tracking the work you've done and
the work remaining. Ideally you want to be able to track a logical transaction against the large table. I think shagols probably wants to be able to commit 1000 updated rows not what is updated every 1000 rows inspected.
I think one good approach is to snapshot chunks of the desired
table updates into a temporary table using the top function and then
update the table by joining it with the temporary table. This batches the updates size to your preferred 1000 rows at a time and allows for scanning large chunks of the update table under the update constraints on each loop pass.
This does require an amenable pk and of course and an correct
ordered select query plan by pk under the joins for the calculation of new column values so it may cause overhead if the clustering isn't on the pk.
Also I would'nt recommend this if the pks are some how mutating either.
Another thing to think about is avoiding second instances of such an update. So youll probably want some form of incremental updates semaphore to avoid a second instance of the same update running against the table concurrently.
Here's a brief outline of what I mean by the snapshot chunking and updates.
Any comments on this approach would be welcomed.
Peter Evasn (__PETER Peter_)
-- Using the status column in the table approach.
declare @runtimeid BigInt
declare @ROWSUPDATED BigInt
-- Get semaphore
update incrementalupdatejobids
set jobrunning = runtimeid + 1
, runtimeid = runtimeid + 1
from incrementalupdatejobids
where name = "table"
and updatetaskname = "updatetaskname"
and jobrunning is null
select @ROWSUPDATED = @@ROWCOUNT
if (@ROWSUPDATED > 0) -- Good update task specific record
begin
-- Set Iterator range values to track job.
select @runtimeid = runtimeid
from incrementalupdatejobids
where name = "table"
and updatetaskname = "updatetaskname"
and jobrunning is not null
insert
(runtimeid, KEYBOUNDARY)
select @runtimeid, NULL
-- Create TABLE [#rowids w updated values]
-- Loop over range
while (@ROWSUPDATED > 0)
begin
-- Get a calculated snapshot of the first 1000 rows to process.
insert into [#rowids w updated values]
TOP 1000 .[ROWID]
, (desired updated value expression) as COL1
, (desired updated value expression) as COL2
...
, (desired updated value expression) as COLN
from
-- OTHER CONSTRAINT TABLES JOIN GO HERE
-- NOTE MOVE THE [ROWID] constraints into the first join clause
-- to indicate precedence for [ROWID] Filtering
-- NOTE IF (TRUE OR UNKNOWN) IS TRUE
where (
(.[ROWID] > (
select last
from
where runtimeid = @runtimeid))
or
(.[ROWID] >= (
select MIN(.[ROWID])
from
WHERE NOT EXISTS (
select * from
where runtimeid = @runtimeid
)
)
)
)
AND
-- ... NORMAL UPDATE CONSTRAINTS GO HERE
order by .[ROWID]
-- Update the tables values
update
set [updated column 1] = [#rowids w updated values].[COL1]
, [updated column 2] = [#rowids w updated values].[COL2]
, .
, .
, .
FROM
INNER JOIN [#rowids w updated values]
on .[ROWID] = [#rowids w updated values].[ROWID]
select @ROWSUPDATED = @@ROWCOUNT
update
set Last = select MAX(.[ROWID])
delete [#rowids to process plus updated values]
end
-- remove job as running from incremental updates tracking table
update
incrementalupdatejobids
set
jobrunning = NULL
from incrementalupdatejobids
where name = "target table name"
and updatetaskname = "arbitrary update task name"
and jobrunning is NOT null
end
March 16, 2004 at 9:10 pm
The most interesting thing about this thread - to me - is that all of us who have posted probably understand one or more of the common ways to do this, but the thread originator - shahgols - hasn't responded yet.  Of course, it's only been about a day and a half, but since the original request was not very specific, it would be interesting to know more specifics about his or her very solvable problem.  In the meantime ... nice talking with the rest of you, I guess!  
Viewing 9 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply