Twenty tips to write a good stored procedure

  • arup_kc (8/24/2009)


    Unfortunately, in this case, you only substitted one form of RBAR for another form of RBAR. The best solution in this case is to find a set-based solution to replace a cursor-based solution (and a WHILE + Temp Table is still a cursor-based solution).

    RBAR = Row By Agonizing Row (A Modenism)

    Hi Lenn,

    What you are telling is absolutely correct. Some there are some situations where we cant avoid cursors...I am telling abt that situation. Also, I have checked the Temp+While solutions, indeed it gave better performance...thats why I included that in this article.[/quote]

    Situations that require cursors or while/temp tables are rare and usually are found in maintenance routines. I have been in the situation where I have seen them used and yes they worked, and unfortunately the attempts I tried to eliminate the cursor failed (ended up with what I now know of as a triangular join, just as bad if not worse), but now that I have learned more I'm not in a position anymore to try and eliminate the cursor (I don't work there anymore).

  • Andy DBA (8/24/2009)


    rja.carnegie (8/24/2009)


    Andy DBA (8/18/2009)

    For example:

    WHERE somecol NOT IN (1,2, NULL)

    is equivalent to

    WHERE NOT (somecol = 1 OR somecol = 2 OR somecol = NULL)

    Wait, isn't that backwards? IN is equivalent to a list of ORs, NOT IN is equivalent to a list of NOT... AND NOT... AND NOT?

    It's not backwards because I still have the NOT outside the parens. The important point here is that because there is a NULL comparison in the expression, the entire expression evaluates to UNKNOWN regardless of the other items in the list or the value in the comparison column. If you're trying to produce a list of records without matching records in another table, (sometimes called "orphans" eg. customers without address records) a NULL in the comparison list will cause the NOT IN query to NEVER return any results. Any orphans that may be present will be missed.

    The NOT EXISTS and LEFT JOIN queries don't have this problem and can even be coded to treat NULL = NULL as a "match" if desired. I wouldn't design a new system like that, but you don't always have control over the data set or business rules.

    Andy, only change I'd make to your code is this:

    ... OR somecol IS NULL) ...

    If you then moved the NOT inside the parens, that would change to this:

    ... AND somecol IS NOT NULL) ...

  • Lynn Pettis (8/24/2009)


    Andy DBA (8/24/2009)


    rja.carnegie (8/24/2009)


    Andy DBA (8/18/2009)

    Lynn Pettis (8/24/2009)

    Andy, only change I'd make to your code is this:

    ... OR somecol IS NULL) ...

    If you then moved the NOT inside the parens, that would change to this:

    ... AND somecol IS NOT NULL) ...

    NO NO NO NO :pinch: Please re-read my original post. It's not "my code". It's the equivalent of what NOT IN does. The whole point is that NOT IN doesn't do an IS NULL comparison. It does an = comparison to each item in the list. If there's a NULL, it does the equvivalent of ...OR somecol = NULL..., so it will NEVER NEVER NEVER return TRUE and the WHERE clause will ALWAYS fail regardless of the value in somecol. (Sorry about the SHOUTING, it's Monday morning and I'm cranky.:w00t:)

    To see this please run the DDL from my original post and then run the following:

    SELECT p.* FROM parent p WHERE p.match_col NOT IN (SELECT c.match_col FROM child c)

    This will return an empty result set even though the data set has orphans. Please then run the following to list the orphans.

    --NULL compared to NULL not considered a match

    SELECT P.* FROM parent p LEFT JOIN child C ON P.match_col = C.match_col

    WHERE C.childid IS NULL

    --NULL compared to NULL is considered a match

    SELECT p.* FROM parent P

    LEFT JOIN child C ON P.match_col = C.match_col OR (P.match_col IS NULL AND C.match_col IS NULL)

    WHERE C.childid IS NULL--notice we're checking the childid here and not match_col

    So, my point is: The NOT IN query doesn't work if the comparison list contains NULLS.

  • Not disputing the NOT IN not working if nulls exist in the data.

    I'm disputing the explicit code that has been shown. First, whoever wrote should not have used somecol = NULL. And if you pull the NOT inside the paranes, the change that should be made.

    Sorry for the confusion.

  • Lynn Pettis (8/24/2009)First, whoever wrote should not have used somecol = NULL. And if you pull the NOT inside the paranes, the change that should be made.

    Sorry for the confusion.

    I disagree. I wrote "somecol = NULL" to illustrate why NOT IN doesn't work if nulls exist in the data. Of course it's incorrect to use it in a query, but it accurately represents what IN does and it's why the NOT IN query doesn't work if nulls exist in the data. With ANSI NULLS ON, "somecol = NULL" evaluates to UNKNOWN instead of TRUE or FALSE.

    Once UNKNOWN is in the expression, the NOT doesn't matter because NOT UNKNOWN is the same as UNKNOWN. If readers are focused on changing "NOT(expression1 OR expression2 OR UNKNOWN)" to "(NOT expression1) AND (NOT expression2) AND (NOT UNKNOWN)", they're probably distracted by a simple boolean logic exercise instead of focusing on the reason why the query doesn't work.

  • I understand that what you wrote was an illistration of how IN works. My point is if you actually write a WHERE clause like that, you would use OR somecol IS NULL, not OR somecol = NULL. Two different things. Have I clarified my post yet? I fully understand yours.

  • Yes. It is now clear that your post was about evaluating NULLs in WHERE clauses. I agree that it is correct to use "is NULL" as shown in three of the example queries in my 8/18/2009 post. I apologize for incorrectly assuming you misinterpreted my illustration of how IN works.

  • Arup has updated the content and I have replaced this content with the updates.

  • Thought I'd get in with the first comment, only a spelling mistake that gets through a spell checker:

    Update: I would recommend you to use SET NOCOUNT ON for the shake of performance unless there is a very good reason for using it.


  • 10. WHERE clauses - In a WHERE clause, the various operators used directly affect how fast a query can run. Here are the conditional operators used in the WHERE clause, ordered by their precedence.

    =, >, =, <=, , !=, !>, !<

    for details, refer to the article: http://msdn.microsoft.com/en-us/library/ms190276.aspx

    Don't all the operators have the same precedence? From reading the article they are all at level 4.


  • 11. More WHERE clause hints - Avoid unnecessary conditions in the WHERE Clause. Also try to avoid a function in the WHERE clause as it presents SQL engine to do index seek. Even it forces SQL full index scans or even table scans.

    prevents the SQL engine doing an

  • Lynn Pettis (8/24/2009)


    I understand that what you wrote was an illistration of how IN works. My point is if you actually write a WHERE clause like that, you would use OR somecol IS NULL, not OR somecol = NULL. Two different things. Have I clarified my post yet? I fully understand yours.

    There could be a role for this IN equivalent (I hope),

    WHERE ( col = @value1 OR col = @value2 OR col = @value3 )

    in which one or more of @value1, @value2, @value3 may be NULL, of course.

    Is it better in that case to rewrite the query separately with one condition fewer, or to re-use one version? I would like to think that the server is smart enough not to spend much time on the comparison to NULL, which counts against writing a separate version for that case.

  • jacroberts (8/25/2009)



    11. More WHERE clause hints - Avoid unnecessary conditions in the WHERE Clause. Also try to avoid a function in the WHERE clause as it presents SQL engine to do index seek. Even it forces SQL full index scans or even table scans.

    should be "prevents the SQL engine doing an".

    And without objecting to the general point, I think in some queries, there'll be one condition that does the heavy work of selecting the records that you're playing with and then some kind of internal intermediate resultset that other conditions are applied to, so that your function is evaluated for 10 rows instead of 10 million (ouch). But on the other hand, the server may decide to select rows only on on one condition, separately select rows on another condition, and then compare and match up the two resultsets to produce the query result. I don't feel verybad that I don't recall correct terminology for this sort of thing, because that wouldn't get make much closer to saying what I mean, but... did I have a point?

    Well, I guess (is this a dummy-level comment?) that if your query strictly looks like

    "SELECT * FROM table WHERE ( dbo.ufn_validity(@nearRowID) = 1 )"

    and you know that the following gets the same

    results,

    "SELECT * FROM table WHERE ( dbo.ufn_validity(@nearRowID) = 1 AND rowNumber BETWEEN @nearRowID-20 AND @nearRowID+20 )"

    then the second version may leave the first for dead-slow.

  • arup_kc (8/24/2009)


    Unfortunately, in this case, you only substitted one form of RBAR for another form of RBAR. The best solution in this case is to find a set-based solution to replace a cursor-based solution (and a WHILE + Temp Table is still a cursor-based solution).

    RBAR = Row By Agonizing Row (A Modenism)

    Hi Lenn,

    What you are telling is absolutely correct. [font="Arial Black"]Some there are some situations where we cant avoid cursors...I am telling abt that situation.[/font] Also, I have checked the Temp+While solutions, indeed it gave better performance...thats why I included that in this article.

    Heh... ya just gotta know that will get my attention... Now I have to go back and look again. 😉

    And the fellow's name is "Lynn", not "Lenn". 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • jacroberts (8/25/2009)



    10. WHERE clauses - In a WHERE clause, the various operators used directly affect how fast a query can run. Here are the conditional operators used in the WHERE clause, ordered by their precedence.

    =, >, =, <=, , !=, !>, !<

    for details, refer to the article: http://msdn.microsoft.com/en-us/library/ms190276.aspx

    Don't all the operators have the same precedence? From reading the article they are all at level 4.

    Yep.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 226 through 240 (of 244 total)

You must be logged in to reply to this topic. Login to reply