SQL Update Behaviour

Question

SQL Update Behaviour

dhanik.girija

SSC Rookie

Points: 25
More actions
July 7, 2011 at 4:18 am

#254192

declare @tempTable1 table
(
ID INT,
DATA1 VARCHAR(20),
DATA2 VARCHAR(20) ,
TM_DATE datetime
)
insert into @tempTable1
(ID,DATA1) values (1,'test data 1')
insert into @tempTable1
(ID,DATA2) values (1,'test data 2')
insert into @tempTable1
(ID,TM_DATE) values (1,getdate())
select * from @tempTable1
declare @tempTable2 table
(
ID INT ,
DATA1 VARCHAR(20),
DATA2 VARCHAR(20) ,
TM_DATE datetime
)
insert into @tempTable2
(ID) values (1)
select * from @tempTable2
UPDATE T2 SET
DATA1 = COALESCE (T1.DATA1,T2.DATA1),
DATA2 = COALESCE (T1.DATA2,T2.DATA2),
TM_DATE = COALESCE (T1.TM_DATE,T2.TM_DATE)
FROM
@tempTable1 T1
INNER JOIN
@tempTable2 T2
ON T2.ID = T1.ID
select * from @tempTable2
Once the update is completed @temptable2 had the following results
1 test data 1 test data 2 2011-07-07 09:02:25.277
Now set ID as primary key on the @temptable2 the update has the following results
1 test data 1 NULL NULL
Some please explain why this behaviour on update?
Forum Etiquette: How to post Reporting Services problems

[/url]

Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
[/url]

How to Post Performance Problems - by Gail Shaw
[/url]

John Mitchell-245523 SSC Guru Points: 148809 More actions · Answer 1

This is a perfect example of why you should not use UPDATE...FROM. Since there is more than one value for each row being updated, the value will be updated with all the values, leaving the one that was done last as the value that persists. If you change the indexing on the table, then the value that is updated last may be different, as you are seing. This is known as a cardinality error, and the big danger is that nothing is reported when you use UPDATE...FROM. You should use the MERGE statement instead. It's difficult to grapple with the syntax, but worth the effort for avoiding this issue.

John

dhanik.girija SSC Rookie Points: 25 More actions · Answer 2

Thanks John,

" If you change the indexing on the table, then the value that is updated last may be different, as you are seing",

If you analyse a bit more details it is not only the order of update is different, but also inserts null values to the updating table which contradict with the COALESCE statement.

My point is, it does not matter order of last update if the primary key is not set on @temptable2 will always have the value.

1test data 1test data 22011-07-07 11:56:46.377

Paul White SSC Guru Points: 150468 More actions · Answer 3

The 'order of update' you refer to is a consequence of physical implementation details that you cannot rely on. Yes, it is true that in one scenario you 'always' see one behaviour, and in the other scenario, something different happens.

No surprises there: the result of this type of update is undefined, logically. Also, nothing contradicts the COALESCE - the result of COALESCE can be NULL if all values provided to it are NULL. Depending on the 'order' of operations, the final value in T2 can indeed be NULL.

As John pointed out, be careful when using UPDATE...FROM to ensure that each destination row is only touched once. The MERGE statement provides protection for this (along with other guarantees). Nevertheless, there is nothing wrong with UPDATE...FROM syntax, used correctly.

Paul White
All articles available on SQL.kiwi
@SQL_Kiwi

spaghettidba SSC Guru Points: 105732 More actions · Answer 4

SQLkiwi (7/7/2011)
Nevertheless, there is nothing wrong with UPDATE...FROM syntax, used correctly.

... unless you're a Joe Celko fan 😉

-- Gianluca Sartori

John Mitchell-245523 SSC Guru Points: 148809 More actions · Answer 5

I actually agree with Joe on this, although not with the argument that it should be avoided because it's not standard SQL. That isn't a concern of mine - if proprietary code does the job better and just as safely, then there's no reason not to use it. However, in this case, I think it's just too dangerous. Even if you use UPDATE FROM only in cases where your updated rows are touched only once, what happens when your data or your database logic changes, but the code does not get updated accordingly? It's much better to familiarise oneself with the MERGE statement and use it every time. That way, there'll be no nasty surprises in the future.

John

spaghettidba SSC Guru Points: 105732 More actions · Answer 6

John Mitchell-245523 (7/7/2011)
I actually agree with Joe on this, although not with the argument that it should be avoided because it's not standard SQL. That isn't a concern of mine - if proprietary code does the job better and just as safely, then there's no reason not to use it. However, in this case, I think it's just too dangerous. Even if you use UPDATE FROM only in cases where your updated rows are touched only once, what happens when your data or your database logic changes, but the code does not get updated accordingly? It's much better to familiarise oneself with the MERGE statement and use it every time. That way, there'll be no nasty surprises in the future.
John

Actually, I tend to use what does the job, regardless of whether it is standard or not.

Unfortunately, I have to work a lot with SQL2005, and there's no MERGE support there. :crying:

-- Gianluca Sartori

John Mitchell-245523 SSC Guru Points: 148809 More actions · Answer 7

I prefer to use standard SQL if everything else is equal. I'm not so bothered about whether it's portable, but it's easier to maintain and less likely to be deprecated in a future edition.

I didn't realise that MERGE wasn't available in 2005 - I guess we're stuck with UPDATE FROM there!

John

Paul White SSC Guru Points: 150468 More actions · Answer 8

John Mitchell-245523 (7/7/2011)
...and less likely to be deprecated in a future edition.

Hi John,

There was an interesting discussion on Connect over Hugo Kornelis' request to deprecate UPDATE...FROM:

http://connect.microsoft.com/SQLServer/feedback/details/332437/deprecate-update-from-and-delete-from

Paul White
All articles available on SQL.kiwi
@SQL_Kiwi

mister.magoo SSC-Forever Points: 47174 More actions · Answer 9

I don't know if anyone has actually answered the question about why the two results are different...

With the PRIMARY KEY defined, the query plan shows a SORT DISTINCT which results in 1 row from the join of the two tables, and seemingly this result contains NULLS.

Without the PRIMARY KEY, you see a Stream Aggregate instead using the SCALAR OPERATOR "ANY" which seems to be filtering out the NULLS.

I don't know enough to say why this happens though.

MM

select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);

Paul White SSC Guru Points: 150468 More actions · Answer 10

mister.magoo (7/7/2011)
With the PRIMARY KEY defined, the query plan shows a SORT DISTINCT which results in 1 row from the join of the two tables, and seemingly this result contains NULLS.
Without the PRIMARY KEY, you see a Stream Aggregate instead using the SCALAR OPERATOR "ANY" which seems to be filtering out the NULLS.

I deliberately avoided this discussion because the behaviour is undefined. Any patterns you might see are co-incidental, not guaranteed, and may cause low blood pressure (for all we know).

ANY does not filter NULLs. Use NULLs in table 1 to see this. The optimizer has many physical plan choice for a given logical query, any arrangement that matches the semantic of the original query is fine. The implementation of the physical operators will often cause apparently repeatable differences in query plans that otherwise produce non-deterministic results, such as the one given here.

By co-incidence, I wrote about Sort Distinct and the ANY aggregate just a few days ago:

http://sqlblog.com/blogs/paul_white/archive/2011/07/02/undocumented-query-plans-the-any-aggregate.aspx

Paul White
All articles available on SQL.kiwi
@SQL_Kiwi

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 11

John Mitchell-245523 (7/7/2011)
... the value will be updated with all the values, leaving the one that was done last as the value that persists...

From BOL UPDATE section: "...This is because a single UPDATE statement never updates the same row two times..."

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

John Mitchell-245523 SSC Guru Points: 148809 More actions · Answer 12

Point taken, but however the database engine handles it internally, it's the physically last value that persists. That's what I was trying to get across.

John

mister.magoo SSC-Forever Points: 47174 More actions · Answer 13

SQLkiwi (7/7/2011)
mister.magoo (7/7/2011)
With the PRIMARY KEY defined, the query plan shows a SORT DISTINCT which results in 1 row from the join of the two tables, and seemingly this result contains NULLS.
Without the PRIMARY KEY, you see a Stream Aggregate instead using the SCALAR OPERATOR "ANY" which seems to be filtering out the NULLS.
I deliberately avoided this discussion because the behaviour is undefined. Any patterns you might see are co-incidental, not guaranteed, and may cause low blood pressure (for all we know).
ANY does not filter NULLs. Use NULLs in table 1 to see this. The optimizer has many physical plan choice for a given logical query, any arrangement that matches the semantic of the original query is fine. The implementation of the physical operators will often cause apparently repeatable differences in query plans that otherwise produce non-deterministic results, such as the one given here.
By co-incidence, I wrote about Sort Distinct and the ANY aggregate just a few days ago:
http://sqlblog.com/blogs/paul_white/archive/2011/07/02/undocumented-query-plans-the-any-aggregate.aspx

Thanks very much for that corrective information, Paul. I appreciate it.

MM

select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);

dhanik.girija SSC Rookie Points: 25 More actions · Answer 14

dhanik.girija

SSC Rookie

Points: 25

July 7, 2011 at 7:42 am

#1349703

Thanks Guys, appreciate all your help.