• crmitchell - Thursday, October 19, 2017 10:23 AM

    Jeff,

    True enough. Reports and third party software are probably the most common situations where you have to deal with formatting in the database.
    Even in those cases it may be feasible to organise the SQL code such that you only need to format once your data manipulation is completed and you have selected a much smaller set of data which should help to mitigate the performance issue.

    Your example shows well how an alternative approach can perform better although I would argue that the FORMAT option is clearer in its intent than the better performing CONVERT option which makes it easier to support, particularly where the support team may be less knowledgeable.
    I would probably stick to using FORMAT where the number of dates requiring formatting is small e.g. 1 date on each of half a dozen rows but where there are large numbers e.g. 200 page report (does anyone actually read those?) with multiple dates on every row or something that gets run very frequently then performance matters and your CONVERT alternative would clearly be better. For cases between these examples which would be better will depend on the specific circumstances.

    You're absolutely correct when you say...

    ...organise the SQL code such that you only need to format once your data manipulation is completed and you have selected a much smaller set of data which should help to mitigate the performance issue.

    When I teach T-SQL methods that do require some formatting for the output, I tell folks that they still need to keep the Data and Presentation layers separate, especially since formatting can be such a load on the code.  Just like you say above, do the heavy lifting and aggregations first, which also usually greatly reduces the amount of data that has to go through any type of formatting including even something as simple as pivoting the data.  Although both a CROSS TAB or the PIVOT operator requires aggregation, it's a rather expensive aggregation because of the decision required to determine which column things must go into.  With that, I do what Peter "PESO" Larsson calls "pre-aggregation", which first aggregates the data to the smaller result set and then do the CROSS TAB or PIVOT on that greatly reduced data. 

    On the continued used of FORMAT... I'll never use it so long as it has such terrible performance even if the code is more clear.  My feeling on all of that can be summed up by two points...
    1.  I won't use something that's got a built in performance problem because performance and resource usage are second only to accuracy and it's a very close second.
    2.  If someone doesn't know how to use CONVERT, they should do a bit of studying.  It's a part of their job to do so.  It's like saying that I shouldn't use a Tally Table (or equivalent function) and should use While Loops instead simply because someone might not understand what it does.  A well placed comment in the code can also help there, as well.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)