Any way to clean up mutually exclusive OR conditions?

  • I have a where clause with some mutually exclusive OR expressions. With a procedural language that allows short circuiting it would look like this:

    a || b || c

    The important thing is that || b implies a is false and || c implies a and b are false. My understanding is because t-sql may execute them in any order, I would need to declare this explicitly:

    a OR (!a AND b) OR (!a AND !b AND c)

    But when a, b, and c are verbose expressions, this gets really ugly and will be error prone during maintenance. Is there anything beyond formatting that can be done to simplify this?

  • Why do you want to short circuit the conditions? It won't improve performance and might actually harm it.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • xr280xr (9/26/2016)


    I have a where clause with some mutually exclusive OR expressions. With a procedural language that allows short circuiting it would look like this:

    a || b || c

    The important thing is that || b implies a is false and || c implies a and b are false. My understanding is because t-sql may execute them in any order, I would need to declare this explicitly:

    a OR (!a AND b) OR (!a AND !b AND c)

    But when a, b, and c are verbose expressions, this gets really ugly and will be error prone during maintenance. Is there anything beyond formatting that can be done to simplify this?

    Actually, your expansion of a || b || c is incorrect. The correct expansion is

    a AND !b AND !c OR -- you need to add that both b and c are false

    !a AND b AND !c OR -- you need to add that c is false

    !a AND !b AND c OR

    a AND b AND c -- you need to add the case when all three are true.

    -- Don't believe me? Try the following

    SELECT 1 ^ 1 ^ 1

    I would have a separate CASE expression for each and return either 1 or 0 as appropriate and then BITWISE XOR the values of the CASE expressions.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • Actually, your expansion of a || b || c is incorrect. The correct expansion is

    a AND !b AND !c OR -- you need to add that both b and c are false

    !a AND b AND !c OR -- you need to add that c is false

    !a AND !b AND c OR

    a AND b AND c -- you need to add the case when all three are true.

    Not true in the context I am speaking of. With procedural short circuiting, if a is true, then b and c would not be evaluated and would therefore be indeterminate (a AND (b OR !b) AND (c or !c) i.e. just a). Yours is truly mutually exclusive so maybe my post title was poorly named. However, my expansion is what I actually need it to do...more of a cascading mutual exclusion.

    Why do you want to short circuit the conditions? It won't improve performance and might actually harm it.

    I don't specifically care about short circuiting. I just need to only evaluate b if a is false and only evaluate c if a and b are false. It's not a performance issue to me, it's the logic I need. I was only pointing out that short circuiting allows that to be written very neatly. I'm hoping for a way to write it more neatly in t-sql or confirmation that it can't be.

  • I was also wondering if instead of mutually exclusive you meant disjoint. Mutually exclusive means that you want to exclude the intersection of two sets, whereas disjoint means that the two sets do not intersect.

    If you do want mutually exclusive then the following would also work.

    SELECT *

    FROM YourTable yt

    CROSS APPLY (

    SELECT 1

    WHERE <your condition 1>

    EXCEPT

    SELECT 1

    WHERE <your condition 2>

    EXCEPT

    SELECT 1

    WHERE <your condition 3>

    ) AS ca(dummy_field) /* you have to return some value, but you don't care

    what that value is as long as each of the subqueries

    returns the same value when their condition is met.

    */

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • xr280xr (9/26/2016)


    I don't specifically care about short circuiting. I just need to only evaluate b if a is false and only evaluate c if a and b are false. It's not a performance issue to me, it's the logic I need. I was only pointing out that short circuiting allows that to be written very neatly. I'm hoping for a way to write it more neatly in t-sql or confirmation that it can't be.

    A CASE expression will do this. It evaluates the subsequent conditions only if all of the preceding conditions evaluate to FALSE or UNKNOWN (NULL). The CASE expression CANNOT return a Boolean value, so generally you want to return some token (usually 'T', 'True' or 1) to indicate when the condition is true and then compare that token to the expected value.

    That being said, be very sure that you really need to evaluate them in the specified order.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • drew.allen (9/26/2016)


    I was also wondering if instead of mutually exclusive you meant disjoint. Mutually exclusive means that you want to exclude the intersection of two sets, whereas disjoint means that the two sets do not intersect.

    My mistake. Sorry for the confusion and thanks for your help!

  • drew.allen (9/26/2016)


    xr280xr (9/26/2016)


    I don't specifically care about short circuiting. I just need to only evaluate b if a is false and only evaluate c if a and b are false. It's not a performance issue to me, it's the logic I need. I was only pointing out that short circuiting allows that to be written very neatly. I'm hoping for a way to write it more neatly in t-sql or confirmation that it can't be.

    A CASE expression will do this. It evaluates the subsequent conditions only if all of the preceding conditions evaluate to FALSE or UNKNOWN (NULL). The CASE expression CANNOT return a Boolean value, so generally you want to return some token (usually 'T', 'True' or 1) to indicate when the condition is true and then compare that token to the expected value.

    That being said, be very sure that you really need to evaluate them in the specified order.

    Drew

    DON'T USE A CASE STATEMENT IN A WHERE CLAUSE. That's just a recipe for disaster, as your queries become non-SARGable. The CASE statement should go in the column list if you need to assign that row to a group. In the WHERE clause, you don't care if it doesn't short circuit the condition.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Luis Cazares (9/26/2016)


    drew.allen (9/26/2016)


    xr280xr (9/26/2016)


    I don't specifically care about short circuiting. I just need to only evaluate b if a is false and only evaluate c if a and b are false. It's not a performance issue to me, it's the logic I need. I was only pointing out that short circuiting allows that to be written very neatly. I'm hoping for a way to write it more neatly in t-sql or confirmation that it can't be.

    A CASE expression will do this. It evaluates the subsequent conditions only if all of the preceding conditions evaluate to FALSE or UNKNOWN (NULL). The CASE expression CANNOT return a Boolean value, so generally you want to return some token (usually 'T', 'True' or 1) to indicate when the condition is true and then compare that token to the expected value.

    That being said, be very sure that you really need to evaluate them in the specified order.

    Drew

    DON'T USE A CASE STATEMENT IN A WHERE CLAUSE. That's just a recipe for disaster, as your queries become non-SARGable. The CASE statement should go in the column list if you need to assign that row to a group. In the WHERE clause, you don't care if it doesn't short circuit the condition.

    Never say, "Never." As with anything else, you have to evaluate it within the context, which we simply don't have. There may be, and likely are, more efficient solutions, but it doesn't appear that the OP is willing to give us enough information to suggest such an alternate. It's entirely possible that other criteria in the query will limit the rowset enough that the fact that this particular piece isn't SARGable may have minimal impact on query performance. We just don't know.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • I can't picture a situation in which a CASE statement could be the only option in a WHERE clause.

    I, however, can see a lack of understanding of SQL. I said never because most newbies can't differentiate a valid reason from an invalid reason. I don't intend to sound harsh, but it's just because I've seen so much bad coding, that I don't trust anyone.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Luis Cazares (9/26/2016)


    I can't picture a situation in which a CASE statement could be the only option in a WHERE clause.

    I, however, can see a lack of understanding of SQL. I said never because most newbies can't differentiate a valid reason from an invalid reason. I don't intend to sound harsh, but it's just because I've seen so much bad coding, that I don't trust anyone.

    When working with a third-party solution, you sometimes have limited options. I have been in situations where using a CASE expression in a WHERE clause was the only option available without invalidating our maintenance contract with that vendor.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • In a Where clause you may have a situation as

    WHERE (a>20) OR

    (b>30 and a<=20) OR

    (c>40 and b<=30 and a<=20)

    The query optimizer can find a parallel process to run these in an efficient fashion if you have an index per column here. Turning these into Union All statements will result into three scans ,not desirable if you do not have the proper indexing.

    In your example, using a || !a & b || ...

    your !a can be represented as a=0 (implying false) to make it SARGable. Without more knowledge as to your data types and actual data this is all I can offer at the moment.

    ----------------------------------------------------

  • xr280xr (9/26/2016)


    I have a where clause with some mutually exclusive OR expressions. With a procedural language that allows short circuiting it would look like this:

    a || b || c

    The important thing is that || b implies a is false and || c implies a and b are false. My understanding is because t-sql may execute them in any order, I would need to declare this explicitly:

    a OR (!a AND b) OR (!a AND !b AND c)

    But when a, b, and c are verbose expressions, this gets really ugly and will be error prone during maintenance. Is there anything beyond formatting that can be done to simplify this?

    That might work in procedural languages, but SQL isn't procedural. It's a bit of a hybrid but is more declarative than procedural.

    Why does this matter? As with most other declarative languages - the SQL engine has latitude over how to answer the query you put together. While there are exceptions, in most cases when you try to play cute and try to specify HOW it operates, performance suffers.... a lot. SQL is designed to operate well on large sets, and most times when you "play", you end up forcing the engine to look at each row one at a time (rather than 100K at a time or faster).

    You REALLY want to verify that things really are slower WITHOUT short-circuiting.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • The usual idiom for this is a CASE expression the usual idiom for this is a CASE expression. I am assuming that you want only one of the three predicates to be true (and the other two false).

    The CASE expression in SQL tests the when clauses in left to right order (no, it does not rearrange them unless the rearrangement is logically identical to the way the expression was written), and returns the then clause value if the when clause test true. If they all fail the else clause returns a value.

    CASE

    WHEN a AND NOT(b) AND NOT(c) THEN 'T'

    WHEN NOT(a) AND b AND NOT(c) THEN 'T'

    WHEN NOT(a) AND NOT(b) AND c THEN 'T'

    ELSE 'F' END = 'F'

    Here is the gimmick; you can read if you write your code correctly, then you can arrange the when clauses to put the most likely true clause to the top of the list. This is not quite the same as short-circuit evaluation, but it is a declarative substitute. When you get a really good SQL compiler, it will keep track of this and might be able to do some rearranging for you based on actual statistics as the program executes.

    The optimizer should be able to see common expressions in the when clauses, and evaluate them only once. This does not save you typing, but it might perform better. That is why my skeleton code shows NOT(); this just makes it easier for parser to pick out duplicate code.

    Books in Celko Series for Morgan-Kaufmann Publishing
    Analytics and OLAP in SQL
    Data and Databases: Concepts in Practice
    Data, Measurements and Standards in SQL
    SQL for Smarties
    SQL Programming Style
    SQL Puzzles and Answers
    Thinking in Sets
    Trees and Hierarchies in SQL

  • drew.allen (9/26/2016)


    Luis Cazares (9/26/2016)


    drew.allen (9/26/2016)


    xr280xr (9/26/2016)


    I don't specifically care about short circuiting. I just need to only evaluate b if a is false and only evaluate c if a and b are false. It's not a performance issue to me, it's the logic I need. I was only pointing out that short circuiting allows that to be written very neatly. I'm hoping for a way to write it more neatly in t-sql or confirmation that it can't be.

    A CASE expression will do this. It evaluates the subsequent conditions only if all of the preceding conditions evaluate to FALSE or UNKNOWN (NULL). The CASE expression CANNOT return a Boolean value, so generally you want to return some token (usually 'T', 'True' or 1) to indicate when the condition is true and then compare that token to the expected value.

    That being said, be very sure that you really need to evaluate them in the specified order.

    Drew

    DON'T USE A CASE STATEMENT IN A WHERE CLAUSE. That's just a recipe for disaster, as your queries become non-SARGable. The CASE statement should go in the column list if you need to assign that row to a group. In the WHERE clause, you don't care if it doesn't short circuit the condition.

    Never say, "Never." As with anything else, you have to evaluate it within the context, which we simply don't have. There may be, and likely are, more efficient solutions, but it doesn't appear that the OP is willing to give us enough information to suggest such an alternate.

    I asked what I was interested in, which was options for the simplification of a fundamental logical condition. I was interested in all cases, not just my current one. I thought someone might have a clever syntactic shortcut. I was curious about the case expression as an alternative which is why I marked it as an answer but I didn't actually end up using it in my current case. To some of the latter posts, again, the goal wasn't to cause short circuiting, but I appreciate your input on the subject.

    Here's what I settled on. AllowsDeletionGracePeriod was a new column so I had to update the where clause to include it.

    DELETE FROM tbl_a

    WHERE [some additional criteria]

    AND (DeleteAsap = 1 OR

    (DeleteAsap = 0 AND AllowsDeletionGracePeriod = 1 AND CreatedDate <= DATEADD(d, -7, GETDATE())) OR

    (DeleteAsap = 0 AND AllowsDeletionGracePeriod = 0 AND (LastTransactionDate IS NULL OR LastTransactionDate <= DATEADD(m, -6, GETDATE())))

    )

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply