October 9, 2020 at 6:52 am
Hi guys,
I am trying to update a large table in small batches. Within the update statement, I am setting a target field equal to the sum of its' previous value and a value in the source table (in bold below). This works fine in a single query, but if I split the query into smaller batches, things appear to enter an infinite spin. I have produced some sample code below which illustrates the issue in the #updateOutput table:
***************************************************************************************
-- Create test tables and insert values
drop table if exists a
drop table if exists b
create table a (var1 nvarchar(10),var2 nvarchar(10),var3 int,amount int)
create table b (var1 nvarchar(10),var2 nvarchar(10),var3 int,amount int)
insert into a
values ('abcd','efgh',4,1),('abcd','ghij',2,8)
insert into b
values ('abcd','ghij',1,16),('abcd','efgh',8,32)
-- Create table for update output
drop table if EXISTS #updateOutput
create table dbo.#updateOutput (
nVarChar(32)
,action nvarchar(10)
,id nVarChar(50)
,id2 nvarchar(50)
,id3 nvarchar(50)
,[updates] nVarChar(500))
-- Perform update on 2 records in batches of 1
declare @rows INT, @count INT;
SET @rows = 1; -- Number of Records to Load Per Batch
SET @count = 0;
WHILE @rows > 0
BEGIN
BEGIN TRAN
update top (@rows) a
set amount = source.amount + target.amount
-- Write history table of inserts and any updated attributes that changed
OUTPUT 'CLAIMS'
,'UPDATE'
,SOURCE.var1
,source.var2
,source.var3
,case
when cast(inserted.amount as int) <> cast(deleted.amount as int) then
cast(concat('AMOUNT Old: ',deleted.amount,' New: ',inserted.amount,' ') as nVarChar(max))
else ''
end
--into #updateOutput
from a target INNER JOIN
b source
on source.var1 = target.var1
and source.var2 = target.var2
and source.amount <> target.amount
-- Increment batch counters
SET @rows = @@ROWCOUNT
SET @count = @count + @rows
COMMIT TRAN
END
***************************************************************************************
It is almost as though summing the source and target fields together are interfering with the @@ROWCOUNT variable.
If someone has a solution to this, I would be very appreciative
Cheers,
Simon
October 9, 2020 at 9:16 pm
The issue is with your UPDATE.
So lets run through this code logically.
Make the objects - no problem here
Set variables - no problem here
loop - "while @rows > 0" ok, lest skip down and check... @rows is updated by @@ROWCOUNT. So as long as data is changing, we loop.
Next the update... THAT is busted. you are updating the top(@rows) rows of the table with no WHERE clause. Iteration 1 of the loop, we update the top(1) rows. so 1 row affected, @rows remains at 1 and @count goes up by 1 as @rows is 1. Then we loop, and UPDATE the top(@rows) rows of data again with no WHERE clause, so we update the first row that comes back (as @rows is still 1). And repeat indefinitely.
What you NEED to do is identify a row as being UPDATED and exclude that from your UPDATE statement with a WHERE clause. You are not telling SQL that the row has been updated but telling it to update the row again and again and again. And this will continue until summing the values results in the INT value being too large.
TL;DR - you need to update a row once and then stop updating that row. Things are not "breaking' due to you trying to do this in batches, it is breaking because you are updating the same row repeatedly. You need a WHERE clause.
The above is all just my opinion on what you should do.
As with all advice you find on a random internet forum - you shouldn't blindly follow it. Always test on a test server to see if there is negative side effects before making changes to live!
I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.
October 9, 2020 at 9:26 pm
Hi Brian,
Thank you so much for your learned reply. This makes perfect sense to me and whilst reading your response, it clicked that all variables except the sum are being compared in the concatenated string but the sum will never match as it is always lagging by 1 iteration. I will find a way to implement a check such that the condition is only satisfied once per record in the target set.
Thanks again mate and have a great day 🙂
Cheers
Simon
Viewing 3 posts - 1 through 2 (of 2 total)
You must be logged in to reply to this topic. Login to reply