More performant way to write this where clause

  • Ahoi,

    i have the following where clause which was created by my predecessor and i wonder if there was a better way to write this since all these OR are not very efficient:

     where(
    (
    (
    a.charge_id= c.bewertungsart_idand
    a.bewertungstyp_id != ' 'and
    a.chargenpflicht_Kz = ' '
    )
    or
    (
    c.bewertungsart_id= ' 'and
    a.bewertungstyp_id = ' 'and
    a.chargenpflicht_Kz = 'X'
    )
    or
    (
    a.bewertungstyp_id != ' '
    and a.chargenpflicht_Kz = 'X'
    and a.Bewertungsart_ID = c.Bewertungsart_ID
    )
    )
    and a.Bestand_Id in ('MCHB','MSKA')
    ) OR
    (
    a.Chargenfuehrung_Kz = ' '
    and c.bewertungsart_id= ' '
    and a.Bestand_Id in ('MARD','MSKA')
    )

    The only things i could think of:

    - replacing the strings with a bit fields for these true/false checks

    - i attempted something trying to compare it to hardcoded strings an in (...)

    But it doesnt produce the same result i guess he has some issues with the empty strings, not sure if it would be faster in the first place

    (a.Chargenfuehrung_Kz = ' 'and c.bewertungsart_id = ' ' and a.Bestand_Id in ('MARD','MSKA'))
    a.Chargenfuehrung_Kz+c.bewertungsart_id+a.Bestand_Id in ('  MARD','  MSKA')

    I wonder if there is better way to write all this OR statements

     

     

  • That WHERE makes it look like your query is suffering from poor alias choices. Bad habits to kick : using table aliases like (a, b, c) or (t1, t2, t3)

    What you have there seems "ok", but difficult to honestly tell. The clause a.Chargenfuehrung_Kz+c.bewertungsart_id+a.Bestand_Id in (' MARD',' MSKA') is likely to be a bad idea though, as it's likely to lower the SARGability of the query (though the complexity of the OR isn't going to be helping the Data Engine).

    Depending on your goals, you might be better off with a lookup table.

    Thom~

    Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
    Larnu.uk

  • Thom A wrote:

    That WHERE makes it look like your query is suffering from poor alias choices. Bad habits to kick : using table aliases like (a, b, c) or (t1, t2, t3)

    What you have there seems "ok", but difficult to honestly tell. The clause a.Chargenfuehrung_Kz+c.bewertungsart_id+a.Bestand_Id in (' MARD',' MSKA') is likely to be a bad idea though, as it's likely to lower the SARGability of the query (though the complexity of the OR isn't going to be helping the Data Engine).

    Depending on your goals, you might be better off with a lookup table.

     

    Yeah its hard to tell and i agree, even though the alias choice was not made by me i still catch myself doing this aswell.

    Thanks for the Lookup hint, ill take a look into that.

    This was the format before i rearragend it, in my opinion absolutely unreadable:

     where (((a.charge_id= c.bewertungsart_idand a.bewertungstyp_id   != ' '       
    and a.chargenpflicht_Kz = ' ') or
    (c.bewertungsart_id= ' 'and a.bewertungstyp_id = ' '
    and a.chargenpflicht_Kz = 'X') or
    ((a.bewertungstyp_id != ' 'and a.chargenpflicht_Kz = 'X' and
    a.Bewertungsart_ID = c.Bewertungsart_ID)))
    and a.Bestand_Id in ('MCHB','MSKA'))
    OR
    (a.Chargenfuehrung_Kz = ' '
    and c.bewertungsart_id= ' '
    and a.Bestand_Id in ('MARD','MSKA'))
  • Ok, i found a to rewrite the query producing the same result:

    --previously:
    (
    a.charge_id = c.bewertungsart_id and
    a.bewertungstyp_id != ' ' and
    a.chargenpflicht_Kz = ' '
    )
    or

    (
    a.bewertungstyp_id != ' '
    and a.chargenpflicht_Kz = 'X'
    and a.Bewertungsart_ID = c.Bewertungsart_ID

    --now:
    (
    a.bewertungstyp_id != ' ' AND
    a.chargenpflicht_Kz in (' ','X') AND
    (
    a.charge_id = c.bewertungsart_id OR
    a.Bewertungsart_ID = c.Bewertungsart_ID
    )

    )

     

    Now i have an additional question:

    Does the ORDER matter if you use in ('...','.....)?

    Example:

    chargenpflicht_Kz in (' ','X')
    --VS
    chargenpflicht_Kz in ('X',' ')

    Knowing there is way more entries with 'X', is there performance boost or will this be corrected by the query optimizer anyway?

    counter

    Attachments:
    You must be logged in to view attached files.
  • Since an IN gets converted to ... OR ... OR ..., I suppose it might make a difference, in which case you'd want to put the most common values first.

    You have a possible issue in your WHERE clause above, in that you need to verify the specific value of a.chargenpflicht_Kz for each condition.  This will better match your original logic while also pre-screening a.chargenpflicht_Kz:

            (
    a.bewertungstyp_id != ' ' AND
    a.chargenpflicht_Kz in (' ','X') AND
    (
    (a.charge_id = c.bewertungsart_id AND
    a.chargenpflicht_Kz = ' ') OR
    (a.Bewertungsart_ID = c.Bewertungsart_ID AND
    a.chargenpflicht_Kz = 'X')
    )
    )

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • ScottPletcher wrote:

    Since an IN gets converted to ... OR ... OR ..., I suppose it might make a difference, in which case you'd want to put the most common values first.

    You have a possible issue in your WHERE clause above, in that you need to verify the specific value of a.chargenpflicht_Kz for each condition.  This will better match your original logic while also pre-screening a.chargenpflicht_Kz:

            (
    a.bewertungstyp_id != ' ' AND
    a.chargenpflicht_Kz in (' ','X') AND
    (
    (a.charge_id = c.bewertungsart_id AND
    a.chargenpflicht_Kz = ' ') OR
    (a.Bewertungsart_ID = c.Bewertungsart_ID AND
    a.chargenpflicht_Kz = 'X')
    )
    )

     

    Oh thanks for the hint, didnt realise im double checking on the same attribute twice

  • It's possible to get rid of "or" in where clause using DeMorgan's Law.  If the goal is readable code tho... sometimes that pretty much goes out the window.

    Some test data:

    drop table if exists test;
    go
    create table test(
    col_valuechar(1) unique not null);
    go
    insert test(col_value) values ('a'),('b'),('c'),('d'),('e');
    go
    /* insert 5 rows of test data */

    Equivalency #1:

    select * from test
    where
    (col_value = 'a'
    or col_value = 'b');
    /* returns 2 records: a,b */

    /* is logically equivelent to */

    select * from test where col_value = 'a'
    union
    select * from test where col_value = 'b';
    /* returns 2 records: a,b */

    Equivalency #2:

    select * from test
    where
    col_value not in('a','b')
    /* returns 3 records: c,d,e */

    /* is logically equivelent to */

    select * from test where col_value != 'a'
    intersect
    select * from test where col_value != 'b';
    /* returns 3 records: c,d,e */

     

     

    Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

  • Sometimes

    select * from test where col_value = 'a'
    union all
    select * from test where col_value = 'b';
  • Jonathan AC Roberts wrote:

    Sometimes

    They're distinct.  Is because performance?

    Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

  • Steve Collins wrote:

    Jonathan AC Roberts wrote:

    Sometimes

    They're distinct.  Is because performance?

    Sorry, it looks like the end of my post was somehow truncated.

    I meant to say that UNION ALL is more efficient than UNION as UNION involves a sort and merge. If you have the right indexes this is faster than using an OR.

  • Aha, gotcha.  I see what you mean about sometimes also.  Sometimes it matters even tho they're distinct.  Thank you.

    Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

  • Since an IN () gets converted to ... OR ... OR ..., <<

    That is only a current SQL Server thing. Other SQL products (and Microsoft will probably catch up someday) can turn the IN() predicate a hash table, a binary search tree, and other fast-access storage methods when the list is long enough. This leads to the programmers trick of patting out the list of values with dummies to hit the magic length and trigger the optimization.

    My recommendation is that you anticipate the future and use the IN() predicate out of habit. It also improves your code when exported. And it just looks more SQL-ly so people don't think you're a BASIC programmer 🙂

    Please post DDL and follow ANSI/ISO standards when asking for help. 

  • Steve Collins wrote:

    Jonathan AC Roberts wrote:

    Sometimes

    They're distinct.  Is because performance?

    If you don't use the union all you could change the actual output of the query (in which case UNION will not be equivalent to OR).  Given the WHERE clause - you the 2 part the union don't have anything in common, but nothing so far cares or assumes that each of the parts only have distinct rows.

    For example - if you were union bunch of bank deposits together you REALLY don't want to deduping rows from bank account 'a even if you do find 50 100$ deposits on the same day.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply