Writing Better T-SQL: Top-Down Design May Not be the Best Choice – Part II

  • same with tables : to my knowledge, tables included in a view, which are not needed for satisfying colunms and do not influence the number of rows returned (that is : a left join on the key of a a unique index) are not included in the query plan. This as a rule of thumb, but you may not rely on it.

  • Our biggest hangup was normalization. There is a thing as too much normalization. A little less normalization (and sometimes duplication) can go a long ways to making operation more efficient.

  • Rather than using views I have often used Table Value Functions to achieve modularization. I first write a lower level detail function that does the filtering and returns detailed results based on the parameters passed in. I use a scalar function that does no data access to implement complex filtering rules. You just pass all the required data into the function as parameters and return a flag that indicates if the criteria was met. Then I write a higher level Table Value Function that does summarization using the results of the first function. You can continue this for as many levels of summarization as you need. If you have reports that allow multiple levels of drill down this will insure that the detailed information will always match the summary information. I can't tell you how many times I have been asked to research why a summary report and a detail report did not match exactly. It almost always comes down to the filtering rules not being exactly the same in the different queries.

    This approach seems to give acceptable performance, provide consistent results, and allow for easy maintainability. If the filtering rules need to change that can be done in the scalar function and it makes sure the correct results get passed up through all the layers of summarization. This works pretty well for any tables where a table scan won't kill you. A good example would be a report with 10 optional filter criteria. If you put in no criteria you get back everything. This type of query is not likely to result in a well optimized query plan anyway. I would not recommend this for really huge tables. In that case a custom query with a specific where clause, at least some required criteria, and proper indexes is probably a better choice.

  • When it comes to views, I never join views to views.

    it's always better to just write the necessary SQL.

    Yes in the end it means more changes if the underlying tables or cardinality change, but I think it's worth the efficiency gain.

  • Stephen Randolph (6/29/2015)


    When it comes to views, I never join views to views.

    Amen

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • This is a very good article. How many times I've seen beautifully written and concise SQL that doesn't perform. As said, "our goal is not to write less SQL, but to write efficient SQL", and that is a valid statement in any computing language.

  • Your conclusion doesn’t match up with your experience in solving the problem at all. You found a view that was poorly written and did the research required to make it perform. You found that removing the scalar functions was preferrable because you didn’t need those fields in the view.

    Your solution did not “duplicate code”, you wrote new code.

    Code that is truly “duplicated” is a maintenance killer! It’s by far the main reason that systems fail today.

    The article was great until I reached your conclusion.

  • markbonafe (8/19/2016)


    Your solution did not “duplicate code”, you wrote new code.

    Code that is truly “duplicated” is a maintenance killer! It’s by far the main reason that systems fail today.

    Wrote new code, that in effect removed the "use" of duplicated code.

    You say that duplicated code is a maintenance killer. But most things in SQL Server are a trade-off. For your system and workload, what is more critical? Developer time? Or focused, highly efficient code? If your system and workload's demands are not so high, then you can accept the trade-off of more generalized, "duplicated" code that is not as focused & less efficient. But if your workload demands are significant, then focused code solutions are your trade-off to achieve that.

    markbonafe (8/19/2016)


    It’s by far the main reason that systems fail today.

    First, that's an interesting assertion. In my own experience, code is often not the true root cause of outright system failures. But we each have our own experiences to draw from.

    Second, what is more common - an application that outright fails due to defects, or an application that runs poorly & inefficiently because the underlying code is incapable of scaling?

    Where's the trade-off? Which is the more common scenario that you should prioritize?

  • Ok, my definition of a "system failure" is having different sections of a system (pages, windows, reports, etc.) that are supposedly displaying the same information but are reporting different results. In my experience, this is caused because code has been updated, but because the code is duplicated across several areas of the code base - not all the routines received the same update.

    You can call it a only bug if you want, but it's definitely a fail.

    You are correct that it's a balancing act. If you are developing a system that requires the most efficiency, then you have to do what's necessary to make the code as efficient as possible. Many, if not most, systems do not need to squeeze every nano-second to work properly. I tend to try and build systems that are easily maintained, especially when small changes or improvements are a matter of routine.

  • I totally agree about the duplication of code. Sometimes it is the best route.

  • I had a couple of comments on this article. In general I thought it was good, however, there's a couple of overlooked key points here I'd like to bring up.

    In issue 2, since this is a view we are working on, it's purpose is to be usable by multiple other stored procs or queries. For a view or table valued function, you DO want to return every useful column in its results. SQL Server doesn't store execution plans for a view or table valued function, they are stored within each stored proc or query that uses them and it is smart enough to choose a different plan depending on how they are used. If you don't include a useful column, then developers will do another join to get that information, and SQL Server will happily hit that same table again.

    A quick example, we have an inline table valued function that gets the current status of a field agent. The TVF includes a join to a code table to get the actual description for the status, but if that column isn't used, the execution plan shows it doesn't even do the work for the join!

    In issue 4, you try to avoid the scalar function issue by avoiding the data it returns, rewriting the query. Scalar functions are known for a number of bad issues, and it's probably best if it were replaced with a CROSS APPLY or TVF if possible.

    It would have been interesting to see the results in the chart at the end if the SELECT from the different views only included the common used columns, instead of doing a SELECT * from the 3 different views, which is really asking for 3 different resultsets not the same resultset implemented 3 ways.

Viewing 11 posts - 16 through 25 (of 25 total)

You must be logged in to reply to this topic. Login to reply