• Hugo Kornelis (12/23/2012)


    I think you are confusing two things about processing and evaluation order in a CASE expression. Let's look at an example:

    CASE

    WHEN x <> 0 THEN y / x

    WHEN (very complicated subquery expression) = 1 THEN (yet another very complicated subquery)

    ELSE 42

    END

    SQL Server guarantees that "y/x" will only be actually computed when "x <> 0" evaluates to True. It also guarantees that "yet another very complicated subquery" will only be actually computed when "x <> 0" does not, and "(very complicated subquery expression) = 1" does evaluate to True.

    Some people also expect that SQL Server will not even process (very complicated subquery expression) when "x <> 0" evaluates to true. And while that could have been the case (after all, the result of that subquery evaluation will not be used), this, in fact, is not guaranteed. My suspicion after reading your comment is that you either once read, or once were bitten by this latter behaviour, and then erroneoulsy expanded into no longer trusting SQL Server not to evaluate the THEN if the WHEN does not evaluate to True.

    As to whether the possible results are evaluated, there appears to be no documentation. My memory tells me that I was bitten way back when by a zero divide error in a result branch that should not have been evaluated, but it was a long time ago. Perhaps my memory is playing me false, or perhaps it was conditioned by seeing that "2" in the question. Maybe it was in some other dbms than sql-server - I can't even remember whether my scratch pad variables were decorated with "@" or not; but of course in a dbms with case statements (in addition to case expressions) the execution of code in unwanted branches would be an pretty disastrous but, so that wouldn't have happened in, for example, sql-anywhere. But regardless of whether an error can occur through evaluation of an unneccessary result expression (and I believe it can't, because I know you are a reliable source of information on that sort of thing) my comment about the implementation being broken stands: if T-SQL is going to try some speculative execution for some reason, it must catch errors in that speculative execution internally and only throw them on to the user-visible level if the speculative execution turns out to have been needed, and it doesn't matter a bit whether what is being speculatively executed is a when-expression, and boolean when-expression, or a result branch.

    It isn't particularly surprising that some people expect the evaluation of when clauses or boolean clauses to stop at first true, because the text of BoL explicitly states that for a searched case expression (which is what we have here, since the comparison is <> not =) the boolean expressions are evaluated in order. Before SQL 2008 R2 it didn't say that evaluation of these expressions stops when true is met, but that's what most people would expect when it's stated that evaluation is done in order. Up to and including SQL 2008 BoL made no reference to the possibility of a when-boolean-expression being evaluated after the first true. So either the documentation was misleading and should have been changed to say that they may all be evaluated even if a true is met before the last boolean, or the implementation should be changed to stop at the first true.

    In 2008 R2 BoL a new remark was introduced into the case page. This begins "The CASE statement evaluates its conditions sequentially and stops with the first condition whose condition is satisfied." That is plainly false. It then goes on to say "In some situations, an expression is evaluated before a CASE statement receives the results of the expression as its input". This must have been intended to say that there are exceptions to the preceding sentence, but since it is essential for an expression to be evaluated before anything receives its result (since until it is evaluated its result is[are] unknown) it doesn't actually say that, it's just a pointless tautology. Of course it seems very unlikely that it was intended to mean what it does say (which can be paraphrased as "in some situations we have failed to to build a temporal paradox into the implementation of case statements") - whoever wrote that addition to BoL apparently wasn't very good at writing clear English, and also seems to have been unaware that T-SQL has case expressions but (unlike several other SQL implementations) no case statements. Anyway, the documentation is now (for 2008 R2 and 2012) clearly wrong, not just misleading (worse than before 2008 R2), and definitely needs fixing (although it would be better to fix the implementation - documenting that the code can throw unneeded errors becauses it indulges in unprotected speculative execution would be a step in the right direction but not a proper fix).

    When it comes to trusting SQL Server (and the things that I associate with it, like ADO) I don't - I've seen too many things broken by a new service pack or a new release, or even by a critical update (and sometimes the change in behaviour was documented, more often not). I've seen the argument that the optimizer, not the semantics, is supreme rather too often to trust it, particularly since in some cases where I felt it was unjustifiable. I do think SQL Server is better than its competition, though - it's what I recommend people to use. But I do tell them to test everything, and have a good try at breaking their code before they believe it will be anything like reliable enough for production.

    Tom