performance: ''IF ... IF'' is faster than ''IF ... AND...''??

  • I have two version of a part of code:

    1. (this is oryginal version)

    if (

    (COND1 AND COND2 AND COND3 AND (dbo.function_name(@p1,@p2) =1))

    OR (!COND1 AND COND4)

    ) SET @temp = 1

    2. (this is my vesion)

    if COND1

    BEGIN

    if COND2

    if COND3

    if (dbo.function_name(@p1,@p2) =1)

    SET @temp =1

    END

    else if COND4 SET @temp = 1

    and the second version is about 10x faster than the first version. why? thanks for all suggestions...

  • Very interesting question... can you tell us what the function is doing?

     

    I could understand a very small difference but 10 times faster!!

  • function compares 2 strings, and returns 1 if they're similar. it's like Levenshtein Distance, but for max difference between 2 words = 1 (O(n), not O(n^2)).

    it is possible, that in 1) function always is called?

  • It's simply that every boolean function in the first example is calculated EVERYTIME it's run - even if one or more of the conditions isn't met.

    The second example checks the conditions one at a time, in the sequence you've specified, and if at any point one of those conditions isn't met, the others aren't calculated as it drops out of the 'IF' condition

    As a programmer, I've always been taught to perform this type of operation in the way you've set out the second example - especially where speed is paramount.

    Hope this helps

  • thanks

    but I red, that AND conditions are evaluated them from left to right in the order they are written - thats mean, that every condition must be "prepare"??

  • I'm trying to find a better way to explain it...

    In example 1, ALL the boolean logic is calculated in order to work out whether to set @Temp equal to 1.  It does matter about what is evaluated first where ANDs are concerned as the expression is commutative ie. doesn't matter which side of the AND either of the variables are.

    Let's look at an example. Imagine COND1 AND COND2 are 'true', but all others are false. In the first example COND1, COND2, COND3, COND4 and the result of function_name will all be compared in order to evaluate the full expression, then act on it accordingly. That means 5 conditions are expressly calculated whether their response is needed or not - plus whatever code is hidden in the function_name.

    If the same CONDs were set as above for example 2, then ONLY 2 CONDs will be tested eg COND1 and COND2 because as soon as you get to COND3 and test it, it falls out to the 'Else' - completely ignoring the call to the function. Just because we don't know what the function does in this example, it doesn't mean that time isn't taken by SQL to evaluate it when it needs to.

    In essence, example 1 will call the function_name even when it doesn't have to - as I've tried to explain here.

    If it doesn't have to call functions, or doesn't have to evaluate expressions then that time can be used to move on to the next command/expression/cycle - hence giving you a massive leap in performance.

    Let me know if this doesn't make sense, and I'll try again It's really worth knowing about these simple ways to improve performance.

     

  • ok, thanks.

    but... I expected that there isn't calculated all boolean logic in 1., - for example: if COND1 is false, the rest of expression (with ANDs) isn't required to calculate (like in java)

  • What you're talking about is called short-circuit boolean evaluation. Some compilers/languages perform it automatically, some you can enable it explicitly, and some don't have it at all. I'm really not sure about T-SQL's support.

  • Not sure how you're using it, but "IF" smacks of Cursors, While Loops, and other forms of "RBAR"... if you really want you code to fly, write it set-based with a CASE statement...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • How does the case proceed differently than the if in this case??

  • i have 2 functions:

    f1() - always returns 1

    and f2() - with loop: "while true"

    when I wrote:

    if (dbo.f1() = 2) AND (dbo.f2() = 2) select 1 else select 2

    there was no problem, I received '2'

    but when I change AND to OR,

    if (dbo.f1() = 2) OR (dbo.f2() = 2) select 1 else select 2

    condition are executing without limit

    so, i think T-SQL has short-circuit evaluation...

  • In a function, instead of (over simplified example)...

    If someconda
    set a variable to value 1
    If somecondb
    set a variable to value 2

    ..etc..

    I use,

    SELECT somevariable = CASE
    WHEN someconda THEN value 1
    WHEN somecondb THEN value 2

    ..etc...

    Seems to be like the first snippet in the orginal code... and it is.... the real difference is that if you remove the variable from the SELECT, you end up with setbased code instead of RBAR and you no longer need the function.  Double barreled gain in performance...

    SELECT CASE
    WHEN someconda THEN value 1
    WHEN somecondb THEN value 2
    FROM sometable/joins

    ..etc...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I always wondered about this. What is the cost of calling a deterministic function in a query (executing on every row low exists) compared to using the code directly in the select.

  • I did a bunch of experiments on several different machines (desktop, 4 cpu server, 8 cpu server, etc) about 9 months ago and the results are posted somewhere on this forum (I'll look for it later).  The results were... not all UDF's are bad, not all are good.  On some of the lower powered machines, functions almost always appeared to be a slight to moderate performance hit depending on how they were written.  On the more powerful machines, they actually improved performance in some cases.  In most of those cases, though, they did draw upon some extra resources to run more quickly.

    It's like any other code... you have to try it and measure it against straight code in the system (computer, OS, server, and related software) and weigh that against the utility of having everyone do it (something) the same way and quickly (sometimes rapid development has a bigger savings than long term minor performance improvement).

    In most cases, inline code wins the race (not always, as I said before).  Sometimes it wins big, sometimes not so big.  Sometimes, even if some code is a bit faster, you have to weigh the ability to troubleshoot quickly and readability (and sometimes expected skill level) against getting a large complex job done and future maintainability.  And, you have to consider machine resources, as well.

    An example of that is that I will sometimes write a complex join in the form of multiple joined derived tables with a comment as to what each derived table does.  The optimizer usually doesn't care and the code runs with the same execution plan as a normal single select.  The utility I've gained is that someone with just a little knowledge can maintain the code and the testers love it because they can test each section of the complex join code separately.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Have you tested if there were any difference between a deterministic function and the same function with the option "with schemabinding" which then becomes deterministic to the eyes of the server?

Viewing 15 posts - 1 through 15 (of 19 total)

You must be logged in to reply to this topic. Login to reply