Guidance

  • Comments posted to this topic are about the item Guidance

  • I'm really not sure why benchmarking isn't done more often, before and also during a performance crisis.

  • I found that I had to offer guidance quite a lot: that's one of the things that happens when you get to be where the buck stops.

    Mostly though I could offer guidance like "you should look at such and such a document and see whether it offers something useful" or "maybe look at what is happening in this part of the syetem which may have some influence on your part" or "don't try to do that until you have learnt x, y and z" or "which of your options for fixing it are you certain won't make things worse, why don't you try those first" or "have you looked at all the data in the various error and event logs and alerts?"

    Telling people what to do when they are very junior and supposed to be learning is OK, but at that stage they shouldn't be getting into anything so complex that it's not clear what is right. Except in a few cases where it's know not to be clear and that's what they have to learn, and then you can tell them that and point out the trade-offs they have to evaluate - but those cases are rare at a very junior/trainee level.

    Offering guidance on forums like those here is much more fraught than advising people in-house, because generally we don't ctually know the situation of the OP and the OP doesn't understand that we don't know the background; that makes it easy to give bad advice, so we have to be very careful.

    Tom

  • Heh... Grant's article really strikes a note for me but not the way that most people would think.

    I'm truly amazed at how much time people spend looking at the different types of waits and trying to tweak hardware and a whole raft of esoteric settings to, in a frequently vain attempt, reduce resource usage and improve performance. Don't get me wrong. There ARE some rote settings changes that should be made on most servers but even those will only produce relatively minor improvements in performance and resource usage.

    Then there are the poor souls that scrap perfectly good hardware and buy the "the next generation" of hardware and go through the huge pain of migrating to the new server and, possibly, migrating to the latest version of SQL Server, only to find they've achieved little to no performance improvements. So, they spend more time and money upgrade their new hardware with SSD's and possibly invest heavily in 3rd party "solutions" only to be further frustrated (although I'll readily admit that SSD deployment in the right areas can really help in many cases).

    We see people that write solutions on forums that use things like Recursive CTEs (rCTE) to replace Tally Tables or Itzik Ben-Gan's fabulous Cascading Tally CTE (my name for it) or continue to post splitters based on While loops. We continue to see posts by "experts" that explain supposed high performance solutions accompanied by a While loop or rCTE that generate a whopping 10 rows of test data. We continue to see multiple calls on CTEs, advice on how to solve problems that use Scalar and Multi-Statement UDF's, and a whole gambit of RBAR solutions and other poor code choices.

    We have oodles of time to spend and the money to buy all sorts of hardware solutions to get performance "improvements" that are typically measured in trivial single digit or low two digit percentages but no time to spend on the real problem which can, many times, result in 3, 4, and sometimes even 5 digit percentages of improvement. Try buying hardware that will give you a 6,000% (60 times improvement). It's usually not possible through hardware even if your whole database actually does fit on an SSD.

    Getting back to the point of this article and to make my point…

    From the article


    The environment in which your application runs is complex and there are lots of variables that can affect the advice or guidance that an expert might recommend. It's the reason we see DBAs often saying "it depends."

    There is one variable in the equation where I never say "It Depends" in the guidance I try to offer and that is "Crap Code will cripple your server. Take the time to identify and fix it and then learn how to avoid writing it to begin with".

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (11/16/2013)


    There is one variable in the equation where I never say "It Depends" in the guidance I try to offer and that is "Crap Code will cripple your server. Take the time to identify and fix it and then learn how to avoid writing it to begin with".

    Yes, that's always good advice. But there are even more important things about crap code: often it will cripple your ability to debug/maintain/enhance your system - and a system that doesn't grow into new things is a system that is dead; especially when "actually working" is one of those new things.

    There's a nice example of crap code here. I haven't a clue whether it will cripple performance or not. I don't even know whether the crappiness is just layout or whether it's a blockheaded choice of join structure, But I know for certain that it already has crippled people's ability to understand what it does and why it doesn't work. The OP can't understand it (that's what he wants help with). Grant didn't understand it. I didn't understand it. So I wrote code to analyse one aspect of it (which is how I learnt that Grant's comment was wrong, hence that he didn't understand it). Everyone who has commented thinks it's crap code. I think writing code like that is a crime, an assault against all standard of decent code. But I'm not actually sure it's going to cause performance problems if it can ever be got to work.

    Tom

  • First, Tom, you are right about that code. I looked, I saw, I ran away! The mixing of Left, Right, and inner joins and then nesting them to boot? If I had code like that in my system I would be looking at rewriting it. Would have to anyway to figure out what it was trying to do.

    I agree with Jeff, before you go out and spend money on fancy new hardware, look at your code. There are probably things you can do to improve it without spending a fortune on hardware.

    I found a simple stored procedure in our system that is run periodically to delete orphaned records from our replication control tables. This procedure took 26 minutes to run. I have a rewrite that is set based instead of cursor based that runs in sub-second time (haven't timed it). Looking at the code in the procedure, I made a very minor change to the existing code. I added an N in front of each of the string constants and now that cursor based function runs sub-second. Just three N's and look at the change in performance.

    We see it all the time on the forums, people looking for things more than X days old, using DATEDIFF and comparing that value to X. Works great on a small data set, try that against several million rows of data.

    That is where I start when looking at performance issues. Once I have eliminated code, then start looking elsewhere for problems. Could be poor or missing indexes, or even stale statistics. If those are good, then keep pecking away at other possibilities.

    It is hard to give guidance, and yes it does seem that our favorite phrase is "It Depends."

  • Lynn Pettis (11/17/2013)


    We see it all the time on the forums, people looking for things more than X days old, using DATEDIFF and comparing that value to X. Works great on a small data set, try that against several million rows of data.

    I've had similar experiences. My most recent I've walked into an already running db/website. They had an audit history table that had built to about 80M rows and was replicated up and down into four other databases. It took about four hours to go day by day to move everything into a history table and delete older than 60 days from the current table. We saw about a 20% increase on the front end.

    Then the DB served multiple unrelated facilities/companies. The SP to look for authorized employees didn't have a facility qualifier on it. So it was looking through a list of over 40K current and former employees from every company that ever used it. We were working on correcting it when we were bought out.

    The free SQL monitoring tools helped spot some of this.

    But I thoroughly agree that you need to look the current before you just go for hardware.



    ----------------
    Jim P.

    A little bit of this and a little byte of that can cause bloatware.

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply