SQL Update Behaviour

  • declare @tempTable1 table

    (

    ID INT,

    DATA1 VARCHAR(20),

    DATA2 VARCHAR(20) ,

    TM_DATE datetime

    )

    insert into @tempTable1

    (ID,DATA1) values (1,'test data 1')

    insert into @tempTable1

    (ID,DATA2) values (1,'test data 2')

    insert into @tempTable1

    (ID,TM_DATE) values (1,getdate())

    select * from @tempTable1

    declare @tempTable2 table

    (

    ID INT ,

    DATA1 VARCHAR(20),

    DATA2 VARCHAR(20) ,

    TM_DATE datetime

    )

    insert into @tempTable2

    (ID) values (1)

    select * from @tempTable2

    UPDATE T2 SET

    DATA1 = COALESCE (T1.DATA1,T2.DATA1),

    DATA2 = COALESCE (T1.DATA2,T2.DATA2),

    TM_DATE = COALESCE (T1.TM_DATE,T2.TM_DATE)

    FROM

    @tempTable1 T1

    INNER JOIN

    @tempTable2 T2

    ON T2.ID = T1.ID

    select * from @tempTable2

    Once the update is completed @temptable2 had the following results

    1 test data 1 test data 2 2011-07-07 09:02:25.277

    Now set ID as primary key on the @temptable2 the update has the following results

    1 test data 1 NULL NULL

    Some please explain why this behaviour on update?

  • This is a perfect example of why you should not use UPDATE...FROM. Since there is more than one value for each row being updated, the value will be updated with all the values, leaving the one that was done last as the value that persists. If you change the indexing on the table, then the value that is updated last may be different, as you are seing. This is known as a cardinality error, and the big danger is that nothing is reported when you use UPDATE...FROM. You should use the MERGE statement instead. It's difficult to grapple with the syntax, but worth the effort for avoiding this issue.

    John

  • Thanks John,

    " If you change the indexing on the table, then the value that is updated last may be different, as you are seing",

    If you analyse a bit more details it is not only the order of update is different, but also inserts null values to the updating table which contradict with the COALESCE statement.

    My point is, it does not matter order of last update if the primary key is not set on @temptable2 will always have the value.

    1test data 1test data 22011-07-07 11:56:46.377

  • The 'order of update' you refer to is a consequence of physical implementation details that you cannot rely on. Yes, it is true that in one scenario you 'always' see one behaviour, and in the other scenario, something different happens.

    No surprises there: the result of this type of update is undefined, logically. Also, nothing contradicts the COALESCE - the result of COALESCE can be NULL if all values provided to it are NULL. Depending on the 'order' of operations, the final value in T2 can indeed be NULL.

    As John pointed out, be careful when using UPDATE...FROM to ensure that each destination row is only touched once. The MERGE statement provides protection for this (along with other guarantees). Nevertheless, there is nothing wrong with UPDATE...FROM syntax, used correctly.

  • SQLkiwi (7/7/2011)


    Nevertheless, there is nothing wrong with UPDATE...FROM syntax, used correctly.

    ... unless you're a Joe Celko fan ๐Ÿ˜‰

    -- Gianluca Sartori

  • I actually agree with Joe on this, although not with the argument that it should be avoided because it's not standard SQL. That isn't a concern of mine - if proprietary code does the job better and just as safely, then there's no reason not to use it. However, in this case, I think it's just too dangerous. Even if you use UPDATE FROM only in cases where your updated rows are touched only once, what happens when your data or your database logic changes, but the code does not get updated accordingly? It's much better to familiarise oneself with the MERGE statement and use it every time. That way, there'll be no nasty surprises in the future.

    John

  • John Mitchell-245523 (7/7/2011)


    I actually agree with Joe on this, although not with the argument that it should be avoided because it's not standard SQL. That isn't a concern of mine - if proprietary code does the job better and just as safely, then there's no reason not to use it. However, in this case, I think it's just too dangerous. Even if you use UPDATE FROM only in cases where your updated rows are touched only once, what happens when your data or your database logic changes, but the code does not get updated accordingly? It's much better to familiarise oneself with the MERGE statement and use it every time. That way, there'll be no nasty surprises in the future.

    John

    Actually, I tend to use what does the job, regardless of whether it is standard or not.

    Unfortunately, I have to work a lot with SQL2005, and there's no MERGE support there. :crying:

    -- Gianluca Sartori

  • I prefer to use standard SQL if everything else is equal. I'm not so bothered about whether it's portable, but it's easier to maintain and less likely to be deprecated in a future edition.

    I didn't realise that MERGE wasn't available in 2005 - I guess we're stuck with UPDATE FROM there!

    John

  • John Mitchell-245523 (7/7/2011)


    ...and less likely to be deprecated in a future edition.

    Hi John,

    There was an interesting discussion on Connect over Hugo Kornelis' request to deprecate UPDATE...FROM:

    http://connect.microsoft.com/SQLServer/feedback/details/332437/deprecate-update-from-and-delete-from

  • I don't know if anyone has actually answered the question about why the two results are different...

    With the PRIMARY KEY defined, the query plan shows a SORT DISTINCT which results in 1 row from the join of the two tables, and seemingly this result contains NULLS.

    Without the PRIMARY KEY, you see a Stream Aggregate instead using the SCALAR OPERATOR "ANY" which seems to be filtering out the NULLS.

    I don't know enough to say why this happens though.

    MM



    select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);

  • Forum Etiquette: How to post Reporting Services problems
  • [/url]
  • Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
  • [/url]
  • How to Post Performance Problems - by Gail Shaw
  • [/url]

  • mister.magoo (7/7/2011)


    With the PRIMARY KEY defined, the query plan shows a SORT DISTINCT which results in 1 row from the join of the two tables, and seemingly this result contains NULLS.

    Without the PRIMARY KEY, you see a Stream Aggregate instead using the SCALAR OPERATOR "ANY" which seems to be filtering out the NULLS.

    I deliberately avoided this discussion because the behaviour is undefined. Any patterns you might see are co-incidental, not guaranteed, and may cause low blood pressure (for all we know).

    ANY does not filter NULLs. Use NULLs in table 1 to see this. The optimizer has many physical plan choice for a given logical query, any arrangement that matches the semantic of the original query is fine. The implementation of the physical operators will often cause apparently repeatable differences in query plans that otherwise produce non-deterministic results, such as the one given here.

    By co-incidence, I wrote about Sort Distinct and the ANY aggregate just a few days ago:

    http://sqlblog.com/blogs/paul_white/archive/2011/07/02/undocumented-query-plans-the-any-aggregate.aspx

  • John Mitchell-245523 (7/7/2011)


    ... the value will be updated with all the values, leaving the one that was done last as the value that persists...

    From BOL UPDATE section: "...This is because a single UPDATE statement never updates the same row two times..."

    โ€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.โ€ - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Point taken, but however the database engine handles it internally, it's the physically last value that persists. That's what I was trying to get across.

    John

  • SQLkiwi (7/7/2011)


    mister.magoo (7/7/2011)


    With the PRIMARY KEY defined, the query plan shows a SORT DISTINCT which results in 1 row from the join of the two tables, and seemingly this result contains NULLS.

    Without the PRIMARY KEY, you see a Stream Aggregate instead using the SCALAR OPERATOR "ANY" which seems to be filtering out the NULLS.

    I deliberately avoided this discussion because the behaviour is undefined. Any patterns you might see are co-incidental, not guaranteed, and may cause low blood pressure (for all we know).

    ANY does not filter NULLs. Use NULLs in table 1 to see this. The optimizer has many physical plan choice for a given logical query, any arrangement that matches the semantic of the original query is fine. The implementation of the physical operators will often cause apparently repeatable differences in query plans that otherwise produce non-deterministic results, such as the one given here.

    By co-incidence, I wrote about Sort Distinct and the ANY aggregate just a few days ago:

    http://sqlblog.com/blogs/paul_white/archive/2011/07/02/undocumented-query-plans-the-any-aggregate.aspx

    Thanks very much for that corrective information, Paul. I appreciate it.

    MM



    select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);

  • Forum Etiquette: How to post Reporting Services problems
  • [/url]
  • Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
  • [/url]
  • How to Post Performance Problems - by Gail Shaw
  • [/url]

  • Thanks Guys, appreciate all your help.

  • Viewing 15 posts - 1 through 15 (of 16 total)

    You must be logged in to reply to this topic. Login to reply