My Problem, or Yours?

  • One thing I have learnt is that it is quicker to solve the problem if all involved stop saying "it's not me". My attitude has always been to look first and see whether it is me and, if not, to prove that it is not me. Especially with users. They don't like a person to say: "Oh, but you did something wrong." Instead I take the blame and I can go on with my life.

    :-P

    Manie Verster
    Developer
    Johannesburg
    South Africa

    I can do all things through Christ who strengthens me. - Holy Bible
    I am a man of fixed and unbending principles, the first of which is to be flexible at all times. - Everett McKinley Dirksen (Well, I am trying. - Manie Verster)

  • Once you eliminate the impossible, whatever remains, no matter how improbable, must be the truth.

    Ideally I'd like a script I can run to do as many smoke tests as I can think of. If you publish the tests you run and the results, then both are up for scrutiny. One day it might be you, and that smoke test script might accelerate diagnosis.
    Think sp_Blitz.

    If a problem arises and you find that there are gaps in the smoke tests you run then you can now extend your scripts.  You are not saying "it's not me" you are saying "I have performed these checks and the checks have not indicated a problem".

    The worst scenario I have heard of is where an obscure Linux setting had a devastating effect on the stability of a database server. Even with external support guys crawling all over it, the cause wasn't found until days later, when someone with a deep and historical knowledge of Linux pondered, "It couldn't possibly be that, could it?" I can't remember what the setting was, but it was certainly not something that many Linux people would know about.
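A minimal sketch of the kind of smoke test script described above, assuming Python and standard-library checks only. The names `run_smoke_tests`, `disk_space_check` and `format_report` are hypothetical, and the single disk space check is only a placeholder for whatever checks matter on your own servers. The point is that the list of checks and their results can be published together, and that the list can be extended when a gap is found.

```python
import shutil
from dataclasses import dataclass
from typing import Callable, List, Tuple

# A check is any zero-argument callable returning (passed, detail).
Check = Callable[[], Tuple[bool, str]]


@dataclass
class CheckResult:
    name: str
    passed: bool
    detail: str


def run_smoke_tests(checks: List[Tuple[str, Check]]) -> List[CheckResult]:
    """Run every check; a check that raises is recorded as a failure."""
    results = []
    for name, check in checks:
        try:
            passed, detail = check()
        except Exception as exc:
            passed, detail = False, f"check raised {type(exc).__name__}: {exc}"
        results.append(CheckResult(name, passed, detail))
    return results


def disk_space_check(path: str = ".", min_free_fraction: float = 0.10) -> Check:
    """Example check (threshold is an arbitrary assumption): enough free disk."""
    def check() -> Tuple[bool, str]:
        usage = shutil.disk_usage(path)
        free = usage.free / usage.total
        return free >= min_free_fraction, f"{free:.1%} free on {path}"
    return check


def format_report(results: List[CheckResult]) -> str:
    """One line per check, suitable for pasting into a ticket or email."""
    return "\n".join(
        f"{'PASS' if r.passed else 'FAIL'}  {r.name}: {r.detail}" for r in results
    )
```

Calling `print(format_report(run_smoke_tests([("disk space", disk_space_check())])))` prints one PASS or FAIL line per check, which supports saying "I have performed these checks and the checks have not indicated a problem" rather than "it's not me".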

  • 2017pjq - Wednesday, August 9, 2017 5:17 PM

    Good write-up of a very common nuance of human behavior. Over a few decades at the coal-face you see this happen over and over, and it pervades companies of varying size and industry sectors.
    This attribute of humanity repeats when a generation of team members moves on and the lesson of the last major disaster is lost.

    It's an attribute that should go away as AI replaces the human element of our professions.
    Bring on the AI!

    Cheer,
    Pete Q

    WOW, I hear you. I maintain a legacy app written maybe 10 years ago. Probably because it's so old, it definitely falls outside of anything like good practices. When I came to work here 2 years ago it was dropped in my lap. I had to make a change to it for some new requirements. No documentation. But what's worse is there wasn't any code, either. All we had were the ASP files and the DLLs. And since they didn't use source control back then, things were really bad. Fortunately, they did do backups of developers' dev boxes, so by going over 5 different backups I was finally able to piece the software back together again and make the necessary changes.

    I am glad that source control is now enforced and code is reliably saved.

    Kindest Regards, Rod

  • One thing that I have encountered over the years is servers that don't have the latest patches applied, i.e. drivers, Windows service packs and hotfixes, and reasonably current SQL Server service packs and/or CUs. Not being current can cause SQL Server installs and upgrades to fail, and is also responsible for mysterious cluster and Availability Group failovers. I have been told that by Microsoft tech support more times than I care to remember (for a while, I was part of a team that did a lot of deployments). And when all else fails, check all of the cabling as mentioned in the article. Perhaps because of redundancy, a single bad connection can go undetected for a long time, causing all sorts of strange things along the way.

  • Issues with the server's cooling fan can cause the CPU to intermittently slow down or freeze. System processes such as a virus scanner or automatic updates can have the same effect.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • I keep it real simple... I assume it's my problem until I can prove, usually with code or logs, that it isn't actually my problem.  As the old saying goes, "The buck stops here".

    I won't get into what I think of many 3rd party vendors when they tell us they won't even talk to us until we can prove that our indexes are defragmented. That shows a pretty high degree of ignorance on their part, but the real reason I brought it up is that they immediately assumed it was OUR problem.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • I wrote an application years ago that loaded data from spreadsheets, stored it in a database, mashed it around a bit and then produced reports. It took around 5 minutes to complete the process for a spreadsheet and this was acceptable (those were the days). Then the department that used it moved from London - where the database server was - to Glasgow and the processing time ballooned to 45+ minutes and sometimes just hung indefinitely. We had weeks of back and forth between DBAs, Devs and network people. The application hadn't changed, the database hadn't changed, the network wasn't maxing out at any point. The only change was the location of the users. It turned out that the application was too chatty with the database; for each row in the spreadsheet, it was doing lookups to the database to enrich the data - opening and closing a connection each time. When the users were in the same building as the server, the connection latency was just about acceptable, but over a wider network it was a major problem.

    My point is: just because one element hasn't changed doesn't mean that it isn't the cause of the problem.

    Chris
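The "too chatty" pattern in the last post can be sketched as follows. This is an illustration under assumptions, not the original application: it uses Python's built-in `sqlite3` module as a stand-in for the real database server, and the `product` table and the function names are hypothetical. SQLite is local, so the sketch does not reproduce network latency; it only counts round trips, which is the quantity that gets multiplied by latency once the users move further from the server.

```python
import os
import sqlite3
import tempfile


def make_demo_db(path):
    """Create a small lookup table (hypothetical schema) for the demo."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE product (code TEXT PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO product VALUES (?, ?)",
        [("A1", "Widget"), ("B2", "Gadget"), ("C3", "Gizmo")],
    )
    conn.commit()
    conn.close()


def enrich_chatty(path, codes):
    """One connection and one query per spreadsheet row."""
    names, round_trips = {}, 0
    for code in codes:
        conn = sqlite3.connect(path)  # opened and closed for every row
        row = conn.execute(
            "SELECT name FROM product WHERE code = ?", (code,)
        ).fetchone()
        conn.close()
        round_trips += 1
        names[code] = row[0] if row else None
    return names, round_trips


def enrich_batched(path, codes):
    """One connection and one query for all distinct codes."""
    unique = sorted(set(codes))
    names = dict.fromkeys(unique)  # unknown codes stay None
    if not unique:
        return names, 0
    placeholders = ",".join("?" * len(unique))
    conn = sqlite3.connect(path)
    rows = conn.execute(
        f"SELECT code, name FROM product WHERE code IN ({placeholders})", unique
    ).fetchall()
    conn.close()
    names.update(rows)  # rows are (code, name) pairs
    return names, 1
```

Both functions return the same lookup results, but for N rows the chatty version makes N round trips and the batched version makes one. A real batch would need chunking, because databases limit the number of parameters in a single statement.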
