
An interesting thing about isnull
Posted Wednesday, March 31, 2010 6:49 AM


SSChampion

James Lean (3/31/2010)
That would seem to be a very specific case; I wouldn't have thought it was a good reason to make a general case for using COALESCE over ISNULL. Personally, I do tend to use COALESCE, even with only two arguments. That way, if you do need to add more arguments in future, you don't have to worry about changing it. Unless I do need to take advantage of the implicit CASTing to the first argument's type, in which case I'll use ISNULL. As always, I suppose, "it depends".

Indeed it does depend.

I think it's a shame, in a way, that almost every QotD or thread that mentions ISNULL or COALESCE attracts a certain element determined to prove that one is always better than the other. (I'm not including you in that, by the way.)

My own view is that the two functions are very different, and each has its merits. I use both regularly, depending on the query I am writing. I realise that such a balanced view lacks any sort of excitement factor, but there it is.
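
As a small illustration of the typing difference James mentions - just a sketch, with a deliberately narrow variable:

DECLARE @s varchar(2);  -- NULL by default, and deliberately narrow

SELECT  ISNULL(@s, 'abcdef'),    -- 'ab': the result takes the type of the first argument, varchar(2)
        COALESCE(@s, 'abcdef');  -- 'abcdef': the result type follows data type precedence across all the arguments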




Paul White
SQL Server MVP
SQLblog.com
@SQL_Kiwi
Post #893715
Posted Wednesday, March 31, 2010 6:55 AM
SSC Rookie

lol owned

good question
Post #893726
Posted Wednesday, March 31, 2010 7:02 AM


SSCertifiable

Paul White NZ (3/31/2010)
Hugo Kornelis (3/31/2010)
So if any of the arguments is a subquery, the implementation of COALESCE((subquery), ...) becomes CASE WHEN (subquery) IS NOT NULL THEN (subquery) ... ELSE NULL END.

That reminds me of a CASE construction like the following:
SELECT  CASE CHECKSUM(NEWID()) % 3 + 1
            WHEN 1 THEN 'One'
            WHEN 2 THEN 'Two'
            WHEN 3 THEN 'Three'
            ELSE 'WTF?'
        END;

...which quite frequently hits the ELSE, and that surprises some people.


That is caused by a similar, related reason. This "simple" CASE is also defined in ANSI in terms of the "searched" CASE (and is apparently implemented that way):
SELECT  CASE
            WHEN CHECKSUM(NEWID()) % 3 + 1 = 1 THEN 'One'
            WHEN CHECKSUM(NEWID()) % 3 + 1 = 2 THEN 'Two'
            WHEN CHECKSUM(NEWID()) % 3 + 1 = 3 THEN 'Three'
            ELSE 'WTF?'
        END;

If you had replaced the CHECKSUM expression with a complex subquery and added another 7 WHEN clauses, the end result would be a query that evaluates the same subquery ten times in a row! <shudder>

And the optimizer, which sits a bit further down the line, apparently does not realize that the two subqueries are identical, so it happily produces a plan that will evaluate the same subquery twice.

Once the CASE is expanded, is it true to say that they are identical? It seems to me that the results are non-deterministic, so it would not be safe to collapse these evaluations. (One might argue that it would be safe at the SERIALIZABLE isolation level, or one of the row-versioning isolation levels, but the plan would not then be reusable.)


I disagree. If the query includes some non-deterministic code (such as NEWID()), there might be room for argument. But if the query itself is deterministic, then the results should be deterministic. A query is supposed to behave as if it is evaluated at a single moment in time; implementation choices should never influence the results.



Hugo Kornelis, SQL Server MVP
Visit my SQL Server blog: http://sqlblog.com/blogs/hugo_kornelis
Post #893731
Posted Wednesday, March 31, 2010 7:45 AM


SSChampion

Hugo Kornelis (3/31/2010)
That is caused by a similar, related reason. This "simple" CASE is also defined in ANSI in terms of the "searched" CASE (and is apparently implemented that way):

Yep, you can see that by examining the query plan produced - the CASE is always expanded to the searched form.
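
To make the expansion concrete when the test expression is a subquery, here is a sketch (dbo.SomeTable is just a placeholder for any real table):

SELECT  CASE (SELECT COUNT(*) FROM dbo.SomeTable)
            WHEN 0 THEN 'Empty'
            WHEN 1 THEN 'One row'
            ELSE 'More'
        END;

-- ...is processed as the searched form, with the subquery repeated in every WHEN:
SELECT  CASE
            WHEN (SELECT COUNT(*) FROM dbo.SomeTable) = 0 THEN 'Empty'
            WHEN (SELECT COUNT(*) FROM dbo.SomeTable) = 1 THEN 'One row'
            ELSE 'More'
        END;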

I disagree. If the query includes some non-deterministic code (such as NEWID()), there might be room for argument. But if the query itself is deterministic, then the results should be deterministic.

I do see your point, but I'm not sure I agree. Once the decision has been taken to expand the CASE, the sub-queries must be re-evaluated. The fact that data access is involved makes the result non-deterministic - the result depends on the state of the database.
We would not expect the following code to be 'collapsed' into a single sub-query execution:
SELECT  result1 = (SELECT COUNT(*) FROM sys.objects),
        result2 = (SELECT COUNT(*) FROM sys.objects),
        result3 = (SELECT COUNT(*) FROM sys.objects);

A query is supposed to behave as if it is evaluated at a single moment in time; implementation choices should never influence the results.

I think this is the crux of the matter. You are only guaranteed this sort of behaviour when running at SERIALIZABLE or one of the row-versioning isolation levels, as I mentioned before. (Also see the example above)




Paul White
SQL Server MVP
SQLblog.com
@SQL_Kiwi
Post #893764
Posted Wednesday, March 31, 2010 7:56 AM


SSCertifiable

Paul White NZ (3/31/2010)
Hugo Kornelis (3/31/2010)
I disagree. If the query includes some non-deterministic code (such as NEWID()), there might be room for argument. But if the query itself is deterministic, then the results should be deterministic.

I do see your point, but I'm not sure I agree. Once the decision has been taken to expand the CASE, the sub-queries must be re-evaluated. The fact that data access is involved makes the result non-deterministic - the result depends on the state of the database.
We would not expect the following code to be 'collapsed' into a single sub-query execution:
SELECT  result1 = (SELECT COUNT(*) FROM sys.objects),
        result2 = (SELECT COUNT(*) FROM sys.objects),
        result3 = (SELECT COUNT(*) FROM sys.objects);


Actually, I would expect exactly that to happen. This is a single query, supposed to return results as if it were executed at a single moment in time. Transaction isolation levels govern what happens if multiple statements are executed in succession. So if I change your code to
DECLARE @result1 int, @result2 int, @result3 int;  -- declarations added so the snippet runs stand-alone

SELECT @result1 = (SELECT COUNT(*) FROM sys.objects);
SELECT @result2 = (SELECT COUNT(*) FROM sys.objects);
SELECT @result3 = (SELECT COUNT(*) FROM sys.objects);

then I would agree that, depending on transaction isolation level and @@TRANCOUNT, the results of the three statements might be different on a busy server. But for your code, with the single SELECT, I would consider anything but three equal values a bug.

And in situations such as the scenario that brought us here, where the actual subquery occurs only once in the submitted query and it's SQL Server itself that expands it, I'd consider it even more buggy!



Hugo Kornelis, SQL Server MVP
Visit my SQL Server blog: http://sqlblog.com/blogs/hugo_kornelis
Post #893777
Posted Wednesday, March 31, 2010 8:14 AM


SSChampion

Hugo Kornelis (3/31/2010)
I would agree that, depending on transaction isolation level and @@TRANCOUNT, the results of the three statements might be different on a busy server. But for your code, with the single SELECT, I would consider anything but three equal values a bug.

I don't see why. The COUNT done at the default isolation level could easily read the same values twice or skip values completely. (Ignore the complications introduced by my use of system tables for the sake of an easy example). Three different scans could easily, in general, produce three different results. The fact that it is a single statement doesn't change that.

And in situations such as the scenario that brought us here, where the actual subquery occurs only once in the submitted query and it's SQL Server itself that expands it, I'd consider it even more buggy!

That I can see, but if we accept that the 'compressed' CASE statement is purely a syntactic convenience, merely shorthand for the searched expression...then clearly the subqueries must be evaluated once per WHEN clause, because data access makes the expression non-deterministic.

I agree that this whole area is a very odd duck though - much like the multiply-referenced CTE issue.




Paul White
SQL Server MVP
SQLblog.com
@SQL_Kiwi
Post #893802
Posted Wednesday, March 31, 2010 8:28 AM


SSCertifiable

Paul White NZ (3/31/2010)
Hugo Kornelis (3/31/2010)
I would agree that, depending on transaction isolation level and @@TRANCOUNT, the results of the three statements might be different on a busy server. But for your code, with the single SELECT, I would consider anything but three equal values a bug.

I don't see why. The COUNT done at the default isolation level could easily read the same values twice or skip values completely.

No. That would be possible if you use NOLOCK or READUNCOMMITTED, but not at the default level (READ COMMITTED).

Three different scans could easily, in general, produce three different results. The fact that it is a single statement doesn't change that.

Except that I believe that when it is a single statement, there should not be three different scans. Especially BECAUSE that could return inconsistent results.

But I think we'll just have to agree to disagree on this one! ;)



Hugo Kornelis, SQL Server MVP
Visit my SQL Server blog: http://sqlblog.com/blogs/hugo_kornelis
Post #893823
Posted Wednesday, March 31, 2010 8:42 AM


SSChampion

Hugo Kornelis (3/31/2010)
No. That would be possible if you use NOLOCK or READUNCOMMITTED, but not at the default level (READ COMMITTED).

I know you know this! It has been blogged many times, but here's the entry from your 'blog-colleague' Alex K:
http://sqlblog.com/blogs/alexander_kuznetsov/archive/2009/04/10/selects-under-read-committed-and-repeatable-read-may-return-incorrect-results.aspx

Double/missed reads are definitely possible, and quite likely, even at REPEATABLE READ.
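
The classic demonstration needs two sessions and a row that moves within the clustered index while a scan is in progress - a rough sketch only, with dbo.Accounts and the key values made up:

-- Session 1: count rows repeatedly at the default isolation level
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
SELECT COUNT(*) FROM dbo.Accounts;

-- Session 2: move a row within the clustering key while session 1 is scanning
UPDATE dbo.Accounts SET AccountID = AccountID + 1000000 WHERE AccountID = 1;

-- If the row moves from behind the scan position to ahead of it, it can be counted twice;
-- if it moves the other way, it can be missed entirely.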

Except that I believe that when it is a single statement, there should not be three different scans. Especially BECAUSE that could return inconsistent results. But I think we'll just have to agree to disagree on this one! ;)

Ah right, I see your point. You think the three identical sub-queries should be 'collapsed' into one execution by SQL Server. Sounds distinctly dodgy to me.




Paul White
SQL Server MVP
SQLblog.com
@SQL_Kiwi
Post #893836
Posted Wednesday, March 31, 2010 8:58 AM


SSChampion

I think my 'distinctly dodgy' comment warrants a little expansion.

AFAIK, SQL Server has never set out to guarantee anything about a query behaving as if it occurred all-at-once at a single point in time. You can approximate this behaviour, indirectly, using the SERIALIZABLE isolation level. Statement-level consistency can also be achieved using READ_COMMITTED_SNAPSHOT. Transaction-level consistency requires full SNAPSHOT. None of them will explicitly perform the sort of 'expression collapsing' you seem to be after - at least not with anything that accesses data. Compute Scalar operators are frequently used to re-use computed constant values in a query plan, of course.
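
For reference, a minimal sketch of the statements involved (the database name is only a placeholder):

ALTER DATABASE SomeDatabase SET READ_COMMITTED_SNAPSHOT ON;   -- statement-level consistency under READ COMMITTED
ALTER DATABASE SomeDatabase SET ALLOW_SNAPSHOT_ISOLATION ON;  -- lets sessions request SNAPSHOT

SET TRANSACTION ISOLATION LEVEL SNAPSHOT;      -- transaction-level consistency (per session)
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;  -- locking-based alternative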




Paul White
SQL Server MVP
SQLblog.com
@SQL_Kiwi
Post #893848
Posted Wednesday, March 31, 2010 9:14 AM


Ten Centuries

Good question.

Also, I learned a lot more from the discussion between Paul and Hugo.


---------------------------------------------------------------------
Useful Links:
KB Article from Microsoft on how to ask a question on a Forum
Post #893864