Writing Better T-SQL: Top-Down Design May Not be the Best Choice – Part II

  • Comments posted to this topic are about the item Writing Better T-SQL: Top-Down Design May Not be the Best Choice – Part II

    Mickey Stuewe
    Sr Database Developer
    My blog
    Follow me on twitter: @SQLMickey
    Connect with me on LinkedIn
    --------------------------------------------------------------------------
    I laugh loudly and I laugh often. Just ask anyone who knows me.

  • This is an excellent illustration of a REAL SQL pro. While we were seldom officially permitted time for this kind of in-depth analysis, it 'separates the men from the boys'. (Sorry Ladies). Reminds me of the ancient COBOL days when lots of people could write it, but very few understood enough to write it well.

    Rick
    Disaster Recovery = Backup ( Backup ( Your Backup ) )

  • This is an excellent and helpful article, though I have to ask when you wrote "There are several issues when you first look at this view and some that are not so obvious", did you purposefully ignore the bad spelling of the View's name itself?

    StoreSlaesPerosnInformation_01

    Presumably you are amending an existing View whose name can't be changed?

    Although the brain can decipher the proper name, good spelling always wins IMHO!

    🙂

  • Good article.

    I seem to get dragged into the same argument over time about whether to use select table.* to form a generic query reusable across reports or to create purpose-built queries. I've always been cautions and maybe overly so with views since they hide things that can hurt you. You give good examples of where/why to think more before choosing.

    Btw the link to the series is broken in the second post.

    Thanks,

    Ron

  • Do the number of columns in the definition of the view really matter from a performance perspective? I thought the columns returned from a view only matter when the view is queried.

  • If the number of rows returned by each view is different, then they are not functionally equivalent.

    And this renders discussion about relative performance largely moot.

    I would be far more interested in a performance comparison where the modular approach is at least functionally equivalent.

    Even then it could suffer unnecessary performance loss due to poorly written inbetween views.

  • Smokie, you read my mind. 😉 Such discrepancies usually don't result in any negative impact to the bottom line, but they could drive you nuts if you're debugging your code and you overlook them in the process because your mind continually autocorrects the spelling especially upon cursory reviews. As a programmer, losing time on such details is such a waste.

  • Agreed, sequelgarrett. It seems as though with that point, the author is assuming you will use

    SELECT *

    FROM [viewWithTooManyColumns]

    ...but if you instead use

    SELECT SomeColumnIReallyNeed, SomeOtherColumnIReallyNeed

    FROM [viewWithTooManyColumns]

    ...then the generated query plan will not be affected by the extra columns in the underlying view.

  • Interesting. I would put maintainable ahead of performance if forced to choose. This sometimes leads me to break up a view with 'too many' joins into a small number of views, each with a specific function that I can then combine or nest. I find this approach can help with understanding how the application works when I or someone else has to re-visit it in the future. So I don't think getting rid of nested views is always preferable.

    Of course, performance has to be acceptable, however that is defined. At a minimum it means the optimizer has to be able to come up with at least a good enough plan. We sometimes have to split a complex query into pieces just to allow that to happen.

  • skeleton567 (6/23/2015)


    This is an excellent illustration of a REAL SQL pro. While we were seldom officially permitted time for this kind of in-depth analysis, it 'separates the men from the boys'. (Sorry Ladies)....

    Hope you realize that the article author Mickey, is a lady too. 😛

  • craig 81366 (6/23/2015)


    If the number of rows returned by each view is different, then they are not functionally equivalent.

    And this renders discussion about relative performance largely moot.

    I would be far more interested in a performance comparison where the modular approach is at least functionally equivalent.

    Even then it could suffer unnecessary performance loss due to poorly written inbetween views.

    +1

    I actually believe on what Mickey points out in this article. We need to write better code and make other people do the same thing.

    However, she's comparing apples to oranges and that's not an appropriate comparison. Hopefully, we can have the results when comparing the query of the views as needed by the report to get the real improvements.

    Thank you Mickey, for taking your time on writing these articles.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • ron.lee (6/23/2015)


    Good article.

    I seem to get dragged into the same argument over time about whether to use select table.* to form a generic query reusable across reports or to create purpose-built queries. I've always been cautions and maybe overly so with views since they hide things that can hurt you. You give good examples of where/why to think more before choosing.

    Btw the link to the series is broken in the second post.

    Thanks,

    Ron

    Hi Ron,

    Thank you for letting me know about the broken link. I will have it fixed. Here is the working link: http://www.sqlservercentral.com/articles/T-SQL/126533/

    Mickey

    Mickey Stuewe
    Sr Database Developer
    My blog
    Follow me on twitter: @SQLMickey
    Connect with me on LinkedIn
    --------------------------------------------------------------------------
    I laugh loudly and I laugh often. Just ask anyone who knows me.

  • sequelgarrett (6/23/2015)


    Do the number of columns in the definition of the view really matter from a performance perspective? I thought the columns returned from a view only matter when the view is queried.

    Hi sequelgarrett,

    Yes, you are right. If you have a single view that returns all the fields in a table and someone queries that view and only selects 3 columns, then only the three columns are selected.

    The situations where it will matter, is when that same view is nested within another view, which nested four more levels down. Computations are created through the views, LEFT JOINS are used here and there. The final view only used 5 columns. There can still be quite a bit of work that was done by SQL Server and was never used. Plus, SQL Server is only given so much time to find a good plan. Wading through all the extra joins that weren't used, takes time and you might be left with a bad plan.

    Last week, I was working on a view that was nested 5 levels deep. I removed all the extra tables and fields that were not needed (after verifying no other SQL Objects needed those tables or fields). I also flattened the initial view and it performed 60% better.

    I hope that helps.

    Mickey

    Mickey Stuewe
    Sr Database Developer
    My blog
    Follow me on twitter: @SQLMickey
    Connect with me on LinkedIn
    --------------------------------------------------------------------------
    I laugh loudly and I laugh often. Just ask anyone who knows me.

  • stephen.kratowicz (6/23/2015)


    Agreed, sequelgarrett. It seems as though with that point, the author is assuming you will use

    SELECT *

    FROM [viewWithTooManyColumns]

    ...but if you instead use

    SELECT SomeColumnIReallyNeed, SomeOtherColumnIReallyNeed

    FROM [viewWithTooManyColumns]

    ...then the generated query plan will not be affected by the extra columns in the underlying view.

    Hi Stephen,

    Yes, your second select statement was the smart way to use the view, but I find that when people are in a hurry or they don't understand indexes, they use SELECT * instead. Even when it's not needed. Writing about the impact of SELECT * helps people who are still learning to understand what really happens.

    Mickey

    Mickey Stuewe
    Sr Database Developer
    My blog
    Follow me on twitter: @SQLMickey
    Connect with me on LinkedIn
    --------------------------------------------------------------------------
    I laugh loudly and I laugh often. Just ask anyone who knows me.

  • Hello!

    I like refactoring. Realy. Who doesn't ;)? But there is a downside:

    1. Time spend on writing/testing it.

    2. Loose of encapsulation. Too deep rewriting lead us to extra work later.

    And you should always estimate all pros and cons before you begin to code.

    Do the author of the article realy do better? I'dont think so:

    1. From the perspective of execution speed this view (IMO) doesn't need any rewriting - it's fast enough (0.7s is suitable for reporting). Or am I miss anything?

    2. Original sql-script is rather clear, it refer to two entities only. Resulting one contains constants and refer to 11 tables. What happend when the application logic will be changed? We will have extra work.

    I think, that it's normal to reuse existing db-objects. Of course, you should control the nesting level of usings and break too long chains. But some layers of abstraction help.

    I agree that only needed columns shoud be included in the result view. And it will be more than enough (IMO) in this case.

    Thank you!

Viewing 15 posts - 1 through 15 (of 25 total)

You must be logged in to reply to this topic. Login to reply