Twenty tips to write a good stored procedure

  • 11. More WHERE clause hints - Avoid unnecessary conditions in the WHERE Clause. You can easily use the first example instead of the second one. The second example uses one extra OR condition which can be avoided using the first example.

    SELECT emp_name FROM table_name WHERE LOWER(emp_name) = 'edu'

    SELECT emp_name FROM table_name WHERE emp_name = 'EDU' OR emp_name = 'edu'

    These two queries are not equivalent, and thus cannot necessarily be substituted one for the other. The first will return a row where emp_name is 'Edu', whereas the second won't. Depending on your requirements that can be a major difference.

  • Paul White (8/10/2009)


    ...unless a RECOMPILE hint is added, in which case the number of rows is available to the optimizer. The deficiency is in the lack of distribution statistics.

    Well, the point the original article made was to use table variables because they don't recompile....

    Plus, while the number of rows total in the table is then known (from the storage engine, interestingly enough), the lack of distribution statistics means that the optimiser's still just guessing as to rows affected by various operators. As far as I can tell, after asking some people about this, is that there's some 'magic number' somewhere in the optimiser code that's used when the total rows in the table is known but the distribution is not. It might be more accurate than guessing 1 row, especially if the data's evenly distributed, but it might not.

    This is not always a bad thing, especially in OLTP queries where loop joins are frequently to be preferred over memory-grant-requiring hashes or merges, and we are often just after the first row.

    But that's only true for smallish row sets. Two large resultsets loop joining is painful.

    If the number of rows is small, the table variable may produce an optimal plan but so, quite likely, will the temp table. And if an OLTP system is crunching thousands of rows to produce one or two, there's something seriously wrong with the queries.

    Table variables are usually fine with small numbers of rows where the optimiser's default estimates are relatively accurate and where plans optimised for 1 row are efficient.

    Editl: I should also mention that the lack of stats is a problem when the table variable is joined to other tables. If all that's been done with the table var is insert into it and then select back all or part, the lack of stats won't have much of an effect. So for a logging table or similar, a table variable is fine.

    btw, merges don't usually require memory grants. Sorts, if one is required to get the rows in order for the merge do, but the merge join itself doesn't unless it's a many-to-many merge join which requires a worktable (in TempDB).

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • jacroberts (8/10/2009)


    ronmoses (8/10/2009)


    10. WHERE clauses - In a WHERE clause, the various operators used directly affect how fast a query can run. Here are the conditional operators used in the WHERE clause, ordered by their performance.

    =, >, =, <=, , !=, !>, !' be much further up the performance list to '!<' as they are both mean exactly the same thing.

    I don't know about performance, but I do know that they are not exactly the same thing. 'Greater than' is not the same as 'Not less than' . The latter will also allow for equality.

  • Grant Fritchey (8/10/2009)


    Where is the author to address the information? Please. This isn't, nor should it be, a public lynching. Your peers have raised serious questions about the work done. You need to address them, even if it's to say, "oops. learned something new."

    Agreed. Everyone makes mistakes at least once in their career. Most of us make them a lot more often than that. Some of us make the mistakes in public, for all to see. 😉

    Making the mistakes isn't what counts, it's how you handle the mistakes after they're pointed out.

    Forum stats show that the author hasn't logged in here since the 24th July. Odd, cause his blog is active.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Eddie Wuerch (8/10/2009)


    Love it when folks incorrectly post about stuff that may be deprecated. After all that info is so hard to come by:

    http://tinyurl.com/kr7tvb (lmgtfy link) 🙂

    One rarely gets a chance to see such carefully implemented sarcasm. (Well, except in a discussion forum where one or more members pull it out regularly.) However, AC surely was referring to features deprecated in the upcoming Microsoft SQL Server Megatron ("Palin"), and later removed outright in SQL Server Galactica ("Skynet"). Spoiling the surprises a bit, but taking 20 years evidently for full adoption of SQL-92 is not bad going in this industry.

    Having looked up "deprecate" in the dictionary, I decide that "use SET @local_variable instead of SELECT @local_variable" does qualify as deprecation, as does what everybody says about cursors. But not in the specific industry meaning of the word, which is also noted there, along with "to pray for deliverance/protection from", which only refers to the more hostile editions of SQL Server 6.x.

    That of course is not the same as "to pray for delivery", which is now forbidden by the Vatican after, well, you remember "The Year of the Three Easters". The faithful now are instructed to wait for at least the first service pack before deploying to production servers, and not to use "the RTM method". And never, never, to mention "Easter eggs", ever again.

  • tom.groszko (8/10/2009)


    Convert also works with Sybase. How do I format a date with CAST as I can with convert.

    You don't, of course. Well, unless you set country code to "ISO". (Joking!)

    A modern product is liable to have some kind of provision for international date types with or without using the same function name, but you also could string together a bunch of DATEPART and DATENAME functions. I also find REPLACE(STR(n, ...), ' ', '0') useful sometimes, but fixing your own dates is being too picky. Mind you, we'll have to do something anyway when we go past the year 9999. Or if anyone has the mixed fortune to go back in time before 1753, which is when the British calendar was made Gregorian.

    http://en.wikipedia.org/wiki/Gregorian_calendar is an interesting (unvalidated) account of a standards body process. You could make a PowerPoint presentation out of that!

  • As for the editorial policy of the site, it's hard to argue with success. This is one of the most successful SQL sites anywhere. Obviously, it has something going for it.

    If I open Management Studio and do Help, Search, for "Twenty tips to write a good stored procedure", I come back here. I'm sure that helps a lot. Mickey Mouse's Let's Begin SQL! would be one of the most visited sites by DB professionals if it had that kind of marketing. If not, then it would depend how you're defining "professional".

    Which speaking of, is why I personally feel obliged to read the daily e-mail about new content. To draw a salary with self-respect, I have to at least pretend that I'm interested in stuff like this!

    So essentially SSC is exploiting our insecurities 🙂

    As for article review or voting, I think there's a possible role for a preview access to read, vote, and comment on a new article if you have a particular interest or want to polish your ego 😉 Have articles seen by a few, particularly well informed and cynical eyes, before giving it the headline!

    If you do the search with Google, several DB bloggers seem to be already excitedly welcoming AC's wise words!

    As a whole, the advice probably will lead many readers to better stored procedures, but not all of the quota of twenty points are contributing equally.

  • Gail,

    I was simply referring to your statement that "Without stats, it estimates 1 row". It is commonly held that SQL Server always estimates one row for table variables - which isn't so.

    GilaMonster (8/11/2009)


    Paul White (8/10/2009)


    This is not always a bad thing, especially in OLTP queries where loop joins are frequently to be preferred over memory-grant-requiring hashes or merges, and we are often just after the first row.

    But that's only true for smallish row sets. Two large result sets loop joining is painful. And if an OLTP system is crunching thousands of rows to produce one or two, there's something seriously wrong with the queries.

    I said especially in, not exclusively in. There are many occasions where a loop join is the best strategy even on large tables - I won't bore you with details, since I am sure you know them already. I can't believe you have never come across a plan which was 'better' when fed from a table variable than a temporary table - especially when a row-goal is set for the optimizer as I mentioned before.

    GilaMonster (8/11/2009)


    btw, merges don't usually require memory grants. Sorts, if one is required to get the rows in order for the merge do, but the merge join itself doesn't unless it's a many-to-many merge join which requires a worktable (in TempDB).

    Ok so let me fix that... "where loop joins are frequently to be preferred over memory-grant-requiring hash or worktable-needing sort-merge (not index-merge) joins". If we're being picky I suppose I should also add that a parallel merge join will indirectly require a memory grant since hash partitioning must be used.

    Paul

  • Grant Fritchey (8/10/2009)


    I do want to raise one point that was mentioned earlier. In 130 or so comments, where is the author? I think he put forward an honest best faith effort to put out a good article, no question. But some of it, based on review, appears to be off. Where is the author to address the information? Please. This isn't, nor should it be, a public lynching. Your peers have raised serious questions about the work done. You need to address them, even if it's to say, "oops. learned something new." For a reference, see here[/url], where I fell, utterly and completely, on my sword, because I messed up a test and completely misinterpreted the results. It happens.

    In defence of the owner, he's written articles for SSC before[/url]. In fact, one of them was (IMHO) a pretty good effort in how to approach query tuning. Not really focussed on specifics, and did note that tables scans are worse that index scans, but they've done a fair job previously.

    Random Technical Stuff[/url]

  • Lynn Pettis (8/10/2009)


    mtassin (8/10/2009)


    I'm sure she wasn't hinting at you. I am pretty sure I have an idea or two who she was thinking about as she wrote this particular post.

    I'm glad to think so. The guys around here tend to think highly of me when it comes to SQL... but then I get to come here and read what those I look up to post and I realize that even after 12 years of this stuff... there's much more to learn 🙂



    --Mark Tassin
    MCITP - SQL Server DBA
    Proud member of the Anti-RBAR alliance.
    For help with Performance click this link[/url]
    For tips on how to post your problems[/url]

  • Paul White (8/10/2009)


    GSquared (8/10/2009)


    Have to sort of disagree with you on this one. Temp tables cause statement recompiles in 2005/8, and proc recompiles in 2000, but even those are often/usually better than the execution plans that you end up with if you use table variables.

    Table variables have no benefits over temp tables except that they are outside of transactions, and that can be an advantage in a few cases (storing data for Try Catch handling to be used after a rollback is one example). Other than being non-transactional, they have no advantages that I know of. Temp tables, on the other hand, have plenty of advantages. They can indexed beyond the PK-clustered index, they have stats, they can easily pass data back and forth between procs, they can be accessed by dynamic SQL, and they can have their first page of data cached for re-use (2005/8) if that's needed, they can have DDL run on them after creation, they can be created by Select Into (very useful when importing data from files where you don't know before-hand what the column sizes need to be), and more.

    I recommend using table variables only if you have to (UDFs, which should be avoided as much as possible anyway for their own reasons), or if you need to hang onto some data despite needing to issue a rollback command.

    That's less than entirely accurate. 🙂

    Rather than tackle each point, I would encourage you to read This TechNet Article, especially the section titled 'Improvements in SQL Server 2005'. Also check out This One to clarify the situation regarding (re)compilation.

    Some points are worth making explicitly:

    * Table variables are cached in tempdb exactly as for temporary tables, table-valued functions, and triggers. This includes the single data page and IAM page optimization.

    * Table variables can be declared with a PRIMARY KEY and UNIQUE constraints.

    * Only table variables can be use in an OUTPUT clause.

    * Sometimes the lack of statistics on table variables is desirable to produce a good 'FAST 1' plan - without the goal-seeking behaviour of TOP or FAST (n).

    I could go on.

    It really isn't on to say that temporary tables are better than temporary variables, or indeed that UDFs are undesirable. It is perfectly possible to misuse just about any feature in SQL Server - each and every one has its strengths and weaknesses. Appropriate use is the name of the game: choosing the right tool for the job.

    Paul

    One correction to the above regarding table variables and the OUTPUT clause, from BOL:

    output_table

    Specifies a table that the returned rows are inserted into instead of being returned to the caller. output_table may be a temporary table.

  • Lynn Pettis (8/10/2009)


    mtassin (8/10/2009)


    I'm sure she wasn't hinting at you. I am pretty sure I have an idea or two who she was thinking about as she wrote this particular post.

    I'm glad to think so. The guys around here tend to think highly of me when it comes to SQL... but then I get to come here and read what those I look up to post and I realize that even after 12 years of this stuff... there's much more to learn 🙂

    I certainly wasn't. I didn't even notice your post count at the time. If I had, I'd have picked a different arbitrary number. :blush:

    There's a small group here who have scores in excess of 1000, some in excess of 1500, purely from asking questions and followup in threads that they started (and maybe some QotD)

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • I have added a note to the article to read the discussion that follows.

  • As far as the editorial policy, feel free to debate it here: http://www.sqlservercentral.com/Forums/Topic768655-83-1.aspx

  • mtassin (8/11/2009)


    Lynn Pettis (8/10/2009)


    mtassin (8/10/2009)


    I'm sure she wasn't hinting at you. I am pretty sure I have an idea or two who she was thinking about as she wrote this particular post.

    I'm glad to think so. The guys around here tend to think highly of me when it comes to SQL... but then I get to come here and read what those I look up to post and I realize that even after 12 years of this stuff... there's much more to learn 🙂

    Heh... I have to agree with you. This site has taught me quite a lot in the last four years.

Viewing 15 posts - 136 through 150 (of 244 total)

You must be logged in to reply to this topic. Login to reply