Writing Better T-SQL: Top-Down Design May Not be the Best Choice

  • Comments posted to this topic are about the item Writing Better T-SQL: Top-Down Design May Not be the Best Choice

    Mickey Stuewe
    Sr Database Developer
    My blog
    Follow me on twitter: @SQLMickey
    Connect with me on LinkedIn
    --------------------------------------------------------------------------
    I laugh loudly and I laugh often. Just ask anyone who knows me.

  • That's fascinating, thank you. I come from an application developer background, and top-down design was always the best approach - especially when reading back through the code to see what it is trying to do.

    Good comments are mandatory when trying to be more efficient, so I'll make doubly sure to include them in any new code I implement!

    Thanks again. 🙂

  • Nice article.

    What I found most interesting about this was the remark:

    For most environments, having faster stored procedures and reducing any potential CPU pressure is more important than the number of reads.

    Because it's the opposite of what I've come to believe: database systems are more likely to be I/O bound than CPU bound, so if something takes a little more CPU time but reads less, it's generally better to use that method, because CPU cycles are cheap but disk accesses aren't.

    Am I mistaken to think that?

    On the other hand, the overall duration is significant, since a stored procedure that takes half the time to run is much less likely to create lock contention than one that takes twice as long. I was once given the advice that the first and foremost measure is the wall clock -- I was making the mistake of making execution plans my first measure, and it was explained to me that, most of the time, something that executes quickly with an ugly execution plan is usually "better" than something that has a beautiful plan shape but takes longer.
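
    For weighing the two, something like the following minimal sketch shows both the reads and the CPU/elapsed time for each candidate (the procedure names here are just placeholders), so the trade-off can be judged per query rather than on principle:

        SET STATISTICS IO ON;
        SET STATISTICS TIME ON;

        -- Run each candidate and compare logical reads vs. CPU and elapsed time
        -- (procedure names are hypothetical)
        EXEC dbo.GetSalesTotals_SingleQuery @CustomerID = 11000;
        EXEC dbo.GetSalesTotals_TempTables  @CustomerID = 11000;

        SET STATISTICS IO OFF;
        SET STATISTICS TIME OFF;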

  • Yes, it's not until you start to work in the real world that you find out 'elegant' is less important than 'efficient'. 🙂

  • I was thinking along the same lines for quite a while.

    But: the optimizer is not an angel,

    and in practice it is often much faster to use an intermediate result stored in a temp table rather than a sub-query, or to use two simple update passes on a table in place of one more complicated pass. I found this out the hard way.
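
    As a rough illustration (table and column names invented), the two shapes look something like this; materialising the intermediate result gives the optimizer a real row count and real statistics for the second step:

        -- One complicated query with the aggregate buried in a sub-query
        SELECT o.CustomerID, o.OrderTotal
        FROM dbo.Orders AS o
        JOIN (SELECT CustomerID, MAX(OrderDate) AS LastOrderDate
              FROM dbo.Orders
              GROUP BY CustomerID) AS lo
          ON lo.CustomerID = o.CustomerID
         AND lo.LastOrderDate = o.OrderDate;

        -- Divide and conquer: store the intermediate result in a temp table first
        SELECT CustomerID, MAX(OrderDate) AS LastOrderDate
        INTO #LastOrder
        FROM dbo.Orders
        GROUP BY CustomerID;

        SELECT o.CustomerID, o.OrderTotal
        FROM dbo.Orders AS o
        JOIN #LastOrder AS lo
          ON lo.CustomerID = o.CustomerID
         AND lo.LastOrderDate = o.OrderDate;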

  • For 15 years I was against splitting code into parts, until I started working with hundreds of millions of records. The interesting part is that in many cases (even after rebuilding statistics) the execution plan you see in SSMS is different from what you actually get when the stored procedure is executed, especially when you have heavy sub-queries.
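
    One way to see what a procedure actually did at run time, as opposed to the estimated plan SSMS shows, is the procedure stats DMV. A minimal sketch (it only covers plans still in cache):

        -- Average CPU, duration and reads per execution for cached procedures
        SELECT OBJECT_NAME(ps.object_id, ps.database_id) AS ProcName,
               ps.execution_count,
               ps.total_worker_time   / ps.execution_count AS AvgCpuMicroseconds,
               ps.total_elapsed_time  / ps.execution_count AS AvgElapsedMicroseconds,
               ps.total_logical_reads / ps.execution_count AS AvgLogicalReads
        FROM sys.dm_exec_procedure_stats AS ps
        WHERE ps.database_id = DB_ID()
        ORDER BY AvgElapsedMicroseconds DESC;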

  • While this is a good example and it's a good article (thank you SQLMickey for taking the time to write it), it IS an example of why "Divide'n'Conquer" methods frequently ARE the answer and why single queries that do a lot of things frequently ARE a major performance problem.

    Also, the article suggests the use of recursive CTEs. I want to remind everyone that certain types of recursive CTEs are frequently worse than either Cursors or While loops. Please see the following article for proof of that.

    http://www.sqlservercentral.com/articles/T-SQL/74118/

    As with all else in SQL Server, "It Depends". Keep both methods in mind because they both have their places.
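
    For anyone who hasn't run into it, the classic trouble spot is a recursive CTE used purely as a counter. A minimal sketch of the two shapes (the row count is arbitrary):

        -- Recursive CTE counter: looks set based, but adds one row per recursion
        WITH nums AS
        (
            SELECT 1 AS n
            UNION ALL
            SELECT n + 1 FROM nums WHERE n < 10000
        )
        SELECT n FROM nums
        OPTION (MAXRECURSION 0);

        -- Set-based alternative: cross join a small row source and number the rows
        WITH e1 AS (SELECT 1 AS x FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) AS v(x)),
             e4 AS (SELECT 1 AS x FROM e1 a CROSS JOIN e1 b CROSS JOIN e1 c CROSS JOIN e1 d)
        SELECT TOP (10000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n
        FROM e4;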

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • It was not so clear to me why SPs 2 and 4 run faster than 1 and 3.

  • Nice thought-provoking article SQLMickey. Here's a thought for you:

    "Let’s take a look at a stored procedure query written four different ways."

    Whilst a junior developer might select one of the four without considering the others, an experienced developer will try all four and compare metrics. An informed choice is always better than a guess.

    Also:

    • Running small queries and inserting the results in temp tables to be used later in the stored procedure. One execution plan will be created and executed for each of these.

    Sometimes this is more efficient than integrating the table source into the main query, and in any case the “extra” execution plan won’t disappear if you do integrate it: it simply becomes part of a bigger and more complex execution plan, running the risk of an optimiser timeout. As above, an experienced developer will test and compare metrics.

    • Running small queries and inserting a single value in a variable to be used later in the stored procedure. One execution plan will be created and executed for each of these.

    I’ll trade an “extra” execution plan for performance every time. The query optimiser doesn’t always get it right and when it doesn’t, temporary storage structures are usually the tool to use to help it along. Divide'n'Conquer, as Jeff says above.
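
    As a small illustration (object names invented), pulling a single value out ahead of the main query keeps that query simple and gives the lookup its own tiny plan:

        DECLARE @CurrentPeriodID int;

        -- Hypothetical lookup: find the fiscal period covering today
        SELECT @CurrentPeriodID = PeriodID
        FROM dbo.FiscalPeriods
        WHERE GETDATE() >= StartDate
          AND GETDATE() <  EndDate;

        -- The main query now filters on a simple variable
        SELECT s.SalesPersonID, SUM(s.Amount) AS PeriodTotal
        FROM dbo.Sales AS s
        WHERE s.PeriodID = @CurrentPeriodID
        GROUP BY s.SalesPersonID;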

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • The article should have emphasized the word "small". When the data sets get large or the query is complex, you really do need a temp table.

    And to agree with previous comments, you need to test your queries, one size never fits all.

  • Very good post! I am a big fan of CTEs and CROSS/OUTER APPLY, but as many said above, sometimes temp tables make the code more efficient and maintainable, so experienced developers will test a few different methods.

    Also, a lot of the time I am ready to sacrifice some performance to make code more readable and maintainable. At the end of the day, if your SP does some heavy ETL at night outside business hours, no one cares whether it runs for 2 hours or 1 hour and 40 minutes 🙂
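
    For anyone who hasn't used the pattern, a typical CROSS APPLY case (table names invented) is a "top 1 per group" lookup, which often reads much better than the equivalent correlated sub-queries:

        -- Latest order per customer, written with CROSS APPLY
        SELECT c.CustomerID, c.CustomerName, lo.OrderID, lo.OrderDate
        FROM dbo.Customers AS c
        CROSS APPLY
        (
            SELECT TOP (1) o.OrderID, o.OrderDate
            FROM dbo.Orders AS o
            WHERE o.CustomerID = c.CustomerID
            ORDER BY o.OrderDate DESC
        ) AS lo;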

  • Great article; it made me think about why I do what I do, and whether maybe there is a better approach.

  • redtux1 (6/2/2015)


    The article should have emphasized the word "small". When the data sets get large or the query is complex, you really do need a temp table.

    And to agree with previous comments, you need to test your queries, one size never fits all.

    Yes, I agree with redtux1; this will work on a small set of data.

  • Simple question: how are the stats for the different queries acquired?

  • Credibility is job one for me, so being able to see intermediate result sets helps me validate that the right data is flowing through my code. Obviously it has to be fast too, so one choice is never always right. I've used CTEs in SSIS while performing some ETL work and had a "simple" series of steps take 1.5 hours to run; I changed the CTE to a stored proc using temp tables and called it from SSIS, and it completed in about 2 minutes. It took a while to find that "right" solution! Also, I like the ease of documenting and reading code that is in blocks. I'm not the most experienced with complicated or tricky code, so maybe my thoughts will change as my SQL skills continue to improve.
