SQL Server - Issues with Update When Summing Source + Target Columns in Batches

  • Hi guys,

    I am trying to update a large table in small batches. Within the update statement, I am setting a target field equal to the sum of its' previous value and a value in the source table (in bold below).  This works fine in a single query, but if I split the query into smaller batches, things appear to enter an infinite spin.  I have produced some sample code below which illustrates the issue in the #updateOutput table:

    ***************************************************************************************

    -- Create test tables and insert values

    drop table if exists a

    drop table if exists b

    create table a (var1 nvarchar(10),var2 nvarchar(10),var3 int,amount int)

    create table b (var1 nvarchar(10),var2 nvarchar(10),var3 int,amount int)

    insert into a

    values ('abcd','efgh',4,1),('abcd','ghij',2,8)

    insert into b

    values ('abcd','ghij',1,16),('abcd','efgh',8,32)

    -- Create table for update output

    drop table if EXISTS #updateOutput

    create table dbo.#updateOutput (

       nVarChar(32)

    ,action    nvarchar(10)

    ,id        nVarChar(50)

    ,id2       nvarchar(50)

    ,id3       nvarchar(50)

    ,[updates] nVarChar(500))

    -- Perform update on 2 records in batches of 1

    declare @rows INT, @count INT;

    SET @rows = 1; -- Number of Records to Load Per Batch

    SET @count = 0;

    WHILE @rows > 0

    BEGIN

    BEGIN TRAN

    update top (@rows) a

    set amount = source.amount + target.amount

    -- Write history table of inserts and any updated attributes that changed

    OUTPUT 'CLAIMS'

    ,'UPDATE'

    ,SOURCE.var1

    ,source.var2

    ,source.var3

    ,case

    when cast(inserted.amount as int) <> cast(deleted.amount as int) then

    cast(concat('AMOUNT Old: ',deleted.amount,' New: ',inserted.amount,' ') as nVarChar(max))

    else ''

    end

    --into #updateOutput

    from a target INNER JOIN

    b source

    on source.var1 = target.var1

    and source.var2 = target.var2

    and source.amount <> target.amount

    -- Increment batch counters

    SET @rows = @@ROWCOUNT

    SET @count = @count + @rows

    COMMIT TRAN

    END

    ***************************************************************************************

    It is almost as though summing the source and target fields together are interfering with the @@ROWCOUNT variable.

    If someone has a solution to this, I would be very appreciative

    Cheers,

    Simon

  • The issue is with your UPDATE.

    So lets run through this code logically.

    Make the objects - no problem here

    Set variables - no problem here

    loop - "while @rows > 0" ok, lest skip down and check... @rows is updated by @@ROWCOUNT.  So as long as data is changing, we loop.

    Next the update... THAT is busted.  you are updating the top(@rows) rows of the table with no WHERE clause.  Iteration 1 of the loop, we update the top(1) rows.  so 1 row affected, @rows remains at 1 and @count goes up by 1 as @rows is 1.  Then we loop, and UPDATE the top(@rows) rows of data again with no WHERE clause, so we update the first row that comes back (as @rows is still 1). And repeat indefinitely.

    What you NEED to do is identify a row as being UPDATED and exclude that from your UPDATE statement with a WHERE clause.  You are not telling SQL that the row has been updated but telling it to update the row again and again and again.  And this will continue until summing the values results in the INT value being too large.

    TL;DR - you need to update a row once and then stop updating that row.  Things are not "breaking' due to you trying to do this in batches, it is breaking because you are updating the same row repeatedly.  You need a WHERE clause.

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • Hi Brian,

    Thank you so much for your learned reply.  This makes perfect sense to me and whilst reading your response, it clicked that all variables except the sum are being compared in the concatenated string but the sum will never match as it is always lagging by 1 iteration.  I will find a way to implement a check such that the condition is only satisfied once per record in the target set.

    Thanks again mate and have a great day 🙂

    Cheers

    Simon

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply