Twenty tips to write a good stored procedure

  • rja.carnegie (8/31/2009)


    There could be a role for this IN equivalent (I hope),

    WHERE ( col = @value1 OR col = @value2 OR col = @value3 )

    in which one or more of @value1, @value2, @value3 may be NULL, of course.

    Is it better in that case to rewrite the query separately with one condition fewer, or to re-use one version? I would like to think that the server is smart enough not to spend much time on the comparison to NULL, which counts against writing a separate version for that case.

    Sorry for beating this to death, but

    You HAVE TO rewrite the query separately. If @value3 is NULL, for example, you have to re-write as

    ...WHERE col IN (@value1, @value2) OR col IS NULL

    which will succeed if col matches @value1 or @value2, or if col is NULL.

    ...WHERE col IN (@value1, @value2, @value3 )

    will ALWAYS fail if @value3 is NULL no matter what value is in col.

    But, if your WHERE col IN clause is a subselect::

    ...WHERE col IN (SELECT othercol FROM othertable)

    you don't have that luxury and should avoid using IN if NULLs might be returned. In this simple case just use an INNER JOIN:

    ...INNER JOIN othertable ON col = othercol

    if you don't want to match on NULLs and

    ...INNER JOIN othertable ON col = othercol OR col IS NULL

    if you do.

  • Andy DBA (8/31/2009)


    Sorry for beating this to death, but

    As am I. I think we've covered this ground already, and determined that the counter-intuitive behaviour is limited to NOT IN and NOT EXISTS...

    Andy DBA (8/31/2009)


    You HAVE TO rewrite the query separately. If @value3 is NULL, for example, you have to re-write as

    ...WHERE col IN (@value1, @value2) OR col IS NULL

    which will succeed if col matches @value1 or @value2, or if col is NULL.

    ...WHERE col IN (@value1, @value2, @value3 )

    will ALWAYS fail if @value3 is NULL no matter what value is in col.

    Tsk tsk for the CAPS ;-). Also, what you have stated is incomplete or wrong, depending on how charitable the reader is:

    SET ANSI_NULLS ON;

    DECLARE @T TABLE (A INT NULL);

    INSERT @T VALUES (1);

    INSERT @T VALUES (2);

    INSERT @T VALUES (NULL);

    SELECT A FROM @T WHERE A IN (1, 2) -- 2 rows

    SELECT A FROM @T WHERE A IN (1, 2, NULL) -- 2 rows

    SELECT A FROM @T WHERE A IN (NULL) -- no rows! (query plan is a constant scan)

    SET ANSI_NULLS OFF;

    SELECT A FROM @T WHERE A IN (1, 2) -- 2 rows

    SELECT A FROM @T WHERE A IN (1, 2, NULL) -- 3 rows!

    SELECT A FROM @T WHERE A IN (NULL) -- 1 row!

    SET ANSI_NULLS ON;

    Andy DBA (8/31/2009)


    But, if your WHERE col IN clause is a subselect::

    ...WHERE col IN (SELECT othercol FROM othertable)

    you don't have that luxury and should avoid using IN if NULLs might be returned. In this simple case just use an INNER JOIN:

    ...INNER JOIN othertable ON col = othercol

    if you don't want to match on NULLs and

    ...INNER JOIN othertable ON col = othercol OR col IS NULL

    if you do.

    This adds nothing new. NULL behaviour with IN is well documented and logical.

    Consider:

    SET ANSI_NULLS ON;

    DECLARE @T TABLE (A INT NULL);

    DECLARE @X TABLE (A INT NULL);

    INSERT @T VALUES (1);

    INSERT @T VALUES (2);

    INSERT @T VALUES (NULL);

    INSERT @X VALUES (1);

    INSERT @X VALUES (NULL);

    SELECT A FROM @X WHERE A IN (SELECT A FROM @T) -- 1 row (where A = 1)

    SET ANSI_NULLS OFF;

    SELECT A FROM @X WHERE A IN (SELECT A FROM @T) -- 2 rows (the NULL also matches now)

    SET ANSI_NULLS ON;

    Had you been referring to NOT IN, you would have more of a point. But again, this has been flogged to death already.

    For completeness then:

    SET ANSI_NULLS ON;

    DECLARE @T TABLE (A INT NULL);

    INSERT @T VALUES (1);

    INSERT @T VALUES (2);

    INSERT @T VALUES (NULL);

    SELECT A FROM @T WHERE A NOT IN (1, 2) -- no rows

    SELECT A FROM @T WHERE A NOT IN (1, 2, NULL) -- no rows

    SELECT A FROM @T WHERE A NOT IN (NULL) -- no rows

    SET ANSI_NULLS OFF;

    SELECT A FROM @T WHERE A NOT IN (1, 2) -- 1 row (for the NULL)

    SELECT A FROM @T WHERE A NOT IN (1, 2, NULL) -- no rows

    SELECT A FROM @T WHERE A NOT IN (NULL) -- 2 rows! (for the non-NULL values)

    SET ANSI_NULLS ON;

    SET ANSI_NULLS ON;

    DECLARE @T TABLE (A INT NULL);

    DECLARE @X TABLE (A INT NULL);

    INSERT @T VALUES (1);

    INSERT @T VALUES (2);

    INSERT @T VALUES (NULL);

    INSERT @X VALUES (1);

    INSERT @X VALUES (NULL);

    SELECT A FROM @X WHERE A NOT IN (SELECT A FROM @T) -- No rows

    SET ANSI_NULLS OFF;

    SELECT A FROM @X WHERE A NOT IN (SELECT A FROM @T) -- No rows

    SET ANSI_NULLS ON;

    Paul

  • These are my 2 cents are the OA:

    RE Capitalizing keywords

    IMO, this is an outdated suggestion. Back in the day when people used monochrome editors, it made sense. It might even make sense when posting to a forum which only shows code in the same color but in general, I prefer pascal casing as it is much easier to read. Frankly, what matters most is consistency and next is readability.

    RE: "Use as few as possible variables."

    Nonsense. I challenge the author to show noticiable metrics on the performance difference. IMO, clarity of code is almost always more important than fractional performance differences due to suggestions like this.

    RE: Dynamic queries

    If you use sp_executesql, then you do get the benefit of query plan reuse which the author point out later. Recompilation isn't the reason against dynamic queries. Obfuscation of code and security issues are the primary reasons against it.

    RE: Recompiles

    GilaMonster's post on 10-Aug says it quite well, "recompiles aren't always bad."

    RE: Set vs Select

    Again, clarity of code wins over fractional performance gains. The Set statement illustrates a far clearer intent than a Select statement to set variables. However, using the Select statement to set a bunch of variables should be seen as more of a coding shortcut than a real performance improver.

    RE: Cast vs Convert

    I would go further and emphasize that the reason to Cast is to make your intent clear. Convert can also be used to format the output which should be avoided in database access code.

    RE: Avoid Distinct

    Does Distinct really perform any better than Group By? Probably not. It is obviously a good idea to eliminate unnecessary uses of Distinct.

    RE: Select Into

    My preference would be for developers to avoid Select Into while it is fine for administrative tasks. The reason, again, is clarity of code especially when the resultset contains a large number of columns.

  • Sorry Arup, I wanted to rate this topic as 5 and by mistake I rated 1. Apologies for this.

    The article is very good and helpful.

    Regards,

    Meenakshi Pande

  • In regard to operators, just quickly concerning negating operations, ie... someColumn <> 'string' ,

    don't forget that rows with a null in that column will not be returned. I just use

    where somecolumn != 'string' or someColumn is null. if that what's intended.

    ----------------------------------------------------

Viewing 5 posts - 241 through 244 (of 244 total)

You must be logged in to reply to this topic. Login to reply