An interesting thing about isnull

Question

An interesting thing about isnull

richard.maw

SSC Veteran

Points: 226
More actions
March 30, 2010 at 8:12 pm

#220733

Comments posted to this topic are about the item An interesting thing about isnull

Viewing 15 posts - 1 through 15 (of 44 total)

You must be logged in to reply to this topic. Login to reply

Fal Hall of Fame Points: 3143 More actions · Answer 1

This QotD raises a point I'm always having to remind developers about, and I guess SQL guys can also do with a reminder from time to time.

However, my 2 cents is rather than...

declare @s-2 char(4)

select ISNULL(convert(varchar(16),@s),'Unknown')

...I would use...

declare @s-2 char(4)

select coalesce(@s,'Unknown')

...which also has the ability to take more than 2 arguments.

S.

SQLRNNR SSC Guru Points: 281334 More actions · Answer 2

Nice question

Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events

SQLRNNR SSC Guru Points: 281334 More actions · Answer 3

Fal (3/30/2010)
This QotD raises a point I'm always having to remind developers about, and I guess SQL guys can also do with a reminder from time to time.
However, my 2 cents is rather than...
declare @s-2 char(4)
select ISNULL(convert(varchar(16),@s),'Unknown')
...I would use...
declare @s-2 char(4)
select coalesce(@s,'Unknown')
...which also has the ability to take more than 2 arguments.
S.

I, too, prefer the coalesce for this same reason. However, you will find that many prefer the isnull due to an increase in performance.

Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events

richard.maw SSC Veteran Points: 226 More actions · Answer 4

It's also interesting that coalesce behaves a different way from isnull.

My point in giving the isnull/convert formulation was to show how to make isnull behave in the way you might expect. I wasn't particularly recommending it.

Can anyone give an account of why the semantics of isnull and coalesce (with two args) is different?

vk-kirov SSCertifiable Points: 7686 More actions · Answer 5

richard.maw (3/30/2010)
Can anyone give an account of why the semantics of isnull and coalesce (with two args) is different?

Here is a nice article about this: http://blogs.msdn.com/sqltips/archive/2008/06/26/differences-between-isnull-and-coalesce.aspx

COALESCE basically translates to CASE expression and ISNULL is a built-in implemented in the database engine.
...
COALESCE is based on the ANSI SQL standard whereas ISNULL is a proprietary TSQL function

James Lean SSCertifiable Points: 5757 More actions · Answer 6

CirquedeSQLeil (3/30/2010)
However, you will find that many prefer the isnull due to an increase in performance.

Do you have a supporting link or further information on the performance difference between ISNULL and COALESCE? I was unaware of any significant difference between the two?

-----
JL

Hugo Kornelis SSC Guru Points: 64780 More actions · Answer 7

James Lean (3/31/2010)
CirquedeSQLeil (3/30/2010)
However, you will find that many prefer the isnull due to an increase in performance.
Do you have a supporting link or further information on the performance difference between ISNULL and COALESCE? I was unaware of any significant difference between the two?

COALESCE is implemented as a quite literal implementation of its definition in the ANSI standard (which is: COALESCE(a, b, ...) is equivalent to CASE WHEN a IS NOT NULL THEN a WHEN b IS NOT NULL THEN b ... ELSE NULL END).

So if any of the arguments is a subquery, the implementation if COALESCE((subquery), ...) becomes CASE WHEN (subquery) IS NOT NULL THEN (subquery) ... ELSE NULL END. And the optimizer that sits a bit further down the line does apparently not realize that the two subqueries are identical, so it happily produces a plan that will evaluate the same subquery twice.

Without the subqueries, I have never witnessed a measurable performance difference.

Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
Visit my SQL Server blog: https://sqlserverfast.com/blog/
SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

James Lean SSCertifiable Points: 5757 More actions · Answer 8

Thanks Hugo. That would seem to be a very specific case, I wouldn't have thought it was a good reason to make a general case for using COALESCE over ISNULL.

Personally I do tend to use COALESCE, even with only two arguments. That way, if you do need to add more arguments in future you don't have to worry about changing it. Unless I do need to take advantage of the implicit CASTing to the first argument's type, in which case I'll use ISNULL.

As always, I suppose, "it depends". 😉

-----
JL

Paul White SSC Guru Points: 150467 More actions · Answer 9

Hugo Kornelis (3/31/2010)
So if any of the arguments is a subquery, the implementation if COALESCE((subquery), ...) becomes CASE WHEN (subquery) IS NOT NULL THEN (subquery) ... ELSE NULL END.

That reminds me of a CASE construction like the following:

SELECT CASE CHECKSUM(NEWID()) % 3 + 1

WHEN 1 THEN 'One'

WHEN 2 THEN 'Two'

WHEN 3 THEN 'Three'

ELSE 'WTF?'

END

...which quite frequently hits the ELSE, and that surprises some people.

And the optimizer that sits a bit further down the line does apparently not realize that the two subqueries are identical, so it happily produces a plan that will evaluate the same subquery twice.

Once expanded by the CASE, is it true to say that they are identical? It seems to me that the results are non-deterministic, so it would not be safe to collapse these evaluations. (One might argue that it would be safe at the SERIALIZABLE isolation level (or one of the row-versioning isolation levels), but the plan would not then be reusable).

I blame the expansion of the CASE - which is apparently a documented design decision. I can live with it, though.

Last thing: the optimizer does a great job at avoiding the extra subquery evaluations where possible. The plan produced is quite likely to feature a PassThrough expression on the joins, so that the subquery is not re-evaluated if a prior term in the COALESCE already produced a non-NULL value.

Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi

Paul White SSC Guru Points: 150467 More actions · Answer 10

James Lean (3/31/2010)
That would seem to be a very specific case, I wouldn't have thought it was a good reason to make a general case for using COALESCE over ISNULL. Personally I do tend to use COALESCE, even with only two arguments. That way, if you do need to add more arguments in future you don't have to worry about changing it. Unless I do need to take advantage of the implicit CASTing to the first argument's type, in which case I'll use ISNULL. As always, I suppose, "it depends". 😉

Indeed it does depend.

I think it's a shame, in a way, that almost every QotD or thread that mentions ISNULL or COALESCE always attracts a certain element determined to prove that one is always better than the other. (I'm not including you in that, by the way)

My own view is that the two functions are very different, and each has its merits. I use both regularly, depending on the query I am writing. I realise that such a balanced view lacks any sort of excitement factor, but there it is 🙂

Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi

sistemas 95572 SSC Enthusiast Points: 135 More actions · Answer 11

sistemas 95572

SSC Enthusiast

Points: 135

March 31, 2010 at 6:55 am

#1143044

lol owned

good question

Hugo Kornelis SSC Guru Points: 64780 More actions · Answer 12

Paul White NZ (3/31/2010)
Hugo Kornelis (3/31/2010)
So if any of the arguments is a subquery, the implementation if COALESCE((subquery), ...) becomes CASE WHEN (subquery) IS NOT NULL THEN (subquery) ... ELSE NULL END.
That reminds me of a CASE construction like the following:
SELECT CASE CHECKSUM(NEWID()) % 3 + 1
WHEN 1 THEN 'One'
WHEN 2 THEN 'Two'
WHEN 3 THEN 'Three'
ELSE 'WTF?'
END
...which quite frequently hits the ELSE, and that surprises some people.

That is caused by a related and similar reason. This "simple" CASE is also defined in ANSI in terms of the "searched" CASE (and apparently implemented in this way):

SELECT CASE

WHEN CHECKSUM(NEWID()) % 3 + 1 = 1 THEN 'One'

WHEN CHECKSUM(NEWID()) % 3 + 1 = 2 THEN 'Two'

WHEN CHECKSUM(NEWID()) % 3 + 1 = 3 THEN 'Three'

ELSE 'WTF?'

END;

If you had replaced the CHECKSUM expression with a complex subquery and added another 7 WHEN clauses, the end result would be a query that evaluates the same subquery ten times in a row! <shudder>

And the optimizer that sits a bit further down the line does apparently not realize that the two subqueries are identical, so it happily produces a plan that will evaluate the same subquery twice.
Once expanded by the CASE, is it true to say that they are identical? It seems to me that the results are non-deterministic, so it would not be safe to collapse these evaluations. (One might argue that it would be safe at the SERIALIZABLE isolation level (or one of the row-versioning isolation levels), but the plan would not then be reusable).

I disagree. If the query includes some non-deterministic code (such as NEWID()), there might be room for argument. But if the query in itself is deterministic, than the results should be deterministic. A query is supposed to behave as if it is evaluated at a single moment in time; implementation choices should never influence the results.

Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
Visit my SQL Server blog: https://sqlserverfast.com/blog/
SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

Paul White SSC Guru Points: 150467 More actions · Answer 13

Hugo Kornelis (3/31/2010)
That is caused by a related and similar reason. This "simple" CASE is also defined in ANSI in terms of the "searched" CASE (and apparently implemented in this way):

Yep, you can see that by examining the query plan produced - the CASE is always expanded to the searched form.

I disagree. If the query includes some non-deterministic code (such as NEWID()), there might be room for argument. But if the query in itself is deterministic, than the results should be deterministic.

I do see your point, but I'm not sure I agree. Once the decision has been taken to expand the CASE, the sub-queries must be re-evaluated. The fact that data access is involved makes the result non-deterministic - the result depends on the state of the database.

We would not expect the following code to be 'collapsed' into a single sub-query execution:

SELECT result1 = (SELECT COUNT(*) FROM sys.objects),

result2 = (SELECT COUNT(*) FROM sys.objects),

result3 = (SELECT COUNT(*) FROM sys.objects)

A query is supposed to behave as if it is evaluated at a single moment in time; implementation choices should never influence the results.

I think this is the crux of the matter. You are only guaranteed this sort of behaviour when running at SERIALIZABLE or one of the row-versioning isolation levels, as I mentioned before. (Also see the example above)

Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi

Hugo Kornelis SSC Guru Points: 64780 More actions · Answer 14

Paul White NZ (3/31/2010)
Hugo Kornelis (3/31/2010)
I disagree. If the query includes some non-deterministic code (such as NEWID()), there might be room for argument. But if the query in itself is deterministic, than the results should be deterministic.
I do see your point, but I'm not sure I agree. Once the decision has been taken to expand the CASE, the sub-queries must be re-evaluated. The fact that data access is involved makes the result non-deterministic - the result depends on the state of the database.
We would not expect the following code to be 'collapsed' into a single sub-query execution:
SELECT result1 = (SELECT COUNT(*) FROM sys.objects),
result2 = (SELECT COUNT(*) FROM sys.objects),
result3 = (SELECT COUNT(*) FROM sys.objects)

Actually, I would expect exactly that to happen. This is a single query, supposed to return results as if it were executed at a single moment in time. Transaction isolation levels govern what happens if multiple statements are executed in succession. So if I change your code to

SELECT @result1 = (SELECT COUNT(*) FROM sys.objects);

SELECT @result2 = (SELECT COUNT(*) FROM sys.objects);

SELECT @result3 = (SELECT COUNT(*) FROM sys.objects);

then I would agree that, depending on transaction isolation level and @@TRANCOUNT, the results of the three statements might be different on a busy server. But for your code, with the single SELECT, I would consider anything but three equal values a bug.

And in situations such as the scenario that brought us here, where the actual subquery occurs only once in the submitted query and it's SQL Server itself that expands it, I'd consider it even more buggy!

Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
Visit my SQL Server blog: https://sqlserverfast.com/blog/
SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/