Dig Out the Root Cause

  • I am certain that, as an industry, we reduce our productivity over the longer term by ignoring root cause analysis or, sometimes more frustratingly, by ignoring the results of root cause analysis.

    Gaz

    -- Stop your grinnin' and drop your linen...they're everywhere!!!

  • In the "Understand the True Source of Problems" article:

    2 – NOLOCK is required on every query. It makes things run faster.

    Instead READ_UNCOMMITTED is suggested.

    At one place I worked at, the developers were told to modify all the SELECT statements to add NOLOCK to the queries.

    Oh well...

  • One problem you will encounter when digging into root causes is that your search will often need to cross departmental or even organizational boundaries, especially in the realm of data, distributed systems, and service-oriented architecture. Developers will implement workarounds, not because they don't know the actual root cause, but because the root cause originates outside their sphere of influence. That's why digging into root causes isn't just an engineering issue; it's just as much a political and executive management issue.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • It is amazing to me how people's career paths can be so different at times and so similar at others...I too started my IT career as a Novell Administrator at a nuclear plant. To this day, I still carry with me the "questioning" attitude developed during my time there, sometimes to my benefit and other times "not so much"!

  • Eric M Russell (4/28/2015)


    One problem you will encounter when digging into root causes is that your search will often need to cross departmental or even organizational boundaries, especially in the realm of data, distributed systems, and service-oriented architecture. Developers will implement workarounds, not because they don't know the actual root cause, but because the root cause originates outside their sphere of influence. That's why digging into root causes isn't just an engineering issue; it's just as much a political and executive management issue.

    +1000

    So much truth in this. An organization that encourages cooperation with individuals who aren't in a department's sphere of influence is a benefit to the developers, the other office, and to the organization generally.

    I've come to find that the pervasiveness of work-arounds in an organization is a pretty reliable surrogate measure for the prevalence of silos and rules based on misinformation or ego.

    For example, when the false tourniquet of "heightened security" cuts off the bloodflow to legitimate work, people whose jobs depend on executing that work will simply devise less secure measures.

    Rich

  • Though there is truth in what you say, there is also truth in that "good enough" has to exist or the company you are working for goes out of business because you take too long to do anything, taking way too long to "research" your problem. What probably happens is that they replace you with someone that can produce for them. We can argue against this all we want but businesses do collapse if we don't produce. There has to be a balance. Extremism on either side spells disaster in one form or another.

  • Ralph Hightower (4/28/2015)


    ...At one place I worked at, the developers were told to modify all the SELECT statements to add NOLOCK to the queries...

    Same here. It was a standards requirement somewhere I worked (not my current place), a business that deals in financial transactions. I raised my concerns but was treated with disdain.

    Gaz

    -- Stop your grinnin' and drop your linen...they're everywhere!!!

  • Gary Varga (4/28/2015)


    I am certain that, as an industry, we reduce our productivity over the longer term by ignoring root cause analysis or, sometimes more frustratingly, by ignoring the results of root cause analysis.

    Sadly, I agree, especially as I find plenty of applications that seem to have lived far beyond what anyone planned.

  • Ralph Hightower (4/28/2015)


    In the "Understand the True Source of Problems" article:

    2 – NOLOCK is required on every query. It makes things run faster.

    Instead READ_UNCOMMITTED is suggested.

    At one place I worked at, the developers were told to modify all the SELECT statements to add NOLOCK to the queries.

    Oh well...

    Yep. What do you do here? It's stunning how often this is the case.

  • rustman (4/28/2015)


    It is amazing to me how people's career paths can be so different at times and so similar at others...I too started my IT career as a Novell Administrator at a nuclear plant. To this day, I still carry with me the "questioning" attitude developed during my time there, sometimes to my benefit and other times "not so much"!

    Which one? I did Surry and North Anna in VA

  • Also, the mindset of not owning the issue up front!

    I have seen a few support teams simply "soft-decline" to own the issue or even to offer a decent investigation. As a result, a superficial check is done, maybe a workaround is applied, and the issue is passed on to a different team or simply marked as closed, until it comes back as an escalation.

    In their defense, some cite volume or time constraints as reasons for not doing a thorough check, but doing RCA is actually a good investment of time that pays off in several ways (experience, knowledge, minimized risk, fewer P1/P2 calls, sound sleep 🙂).

  • Steve Jones - SSC Editor (4/28/2015)


    Ralph Hightower (4/28/2015)


    In the "Understand the True Source of Problems" article:

    2 – NOLOCK is required on every query. It makes things run faster.

    Instead READ_UNCOMMITTED is suggested.

    At one place I worked at, the developers were told to modify all the SELECT statements to add NOLOCK to the queries.

    Oh well...

    Yep. What do you do here? It's stunning how often this is the case.

    Like Gary Varga, I worked for a business in the financial industry.

  • Steve Jones - SSC Editor (4/28/2015)


    Ralph Hightower (4/28/2015)


    In the "Understand the True Source of Problems" article:

    2 – NOLOCK is required on every query. It makes things run faster.

    Instead READ_UNCOMMITTED is suggested.

    At one place I worked at, the developers were told to modify all the SELECT statements to add NOLOCK to the queries.

    Oh well...

    Yep. What do you do here? It's stunning how often this is the case.

    This is why the person who calls the shots regarding database architecture and development really needs to be someone who "gets databases".

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Iwas Bornready (4/28/2015)


    Though there is truth in what you say, there is also truth in that "good enough" has to exist or the company you are working for goes out of business because you take too long to do anything, taking way too long to "research" your problem. What probably happens is that they replace you with someone that can produce for them. We can argue against this all we want but businesses do collapse if we don't produce. There has to be a balance. Extremism on either side spells disaster in one form or another.

    While I agree extremism can cause disasters, and there may be times when "good enough" is just that, I have to say that in my experience (banking, utilities, insurance), taking the time to do root cause analysis (RCA) saves money in the long run.

    For example:

    Business unit: "This report is wrong! The client is upset! We look like dorks!"

    Option 1: "Fix" the report. That works until the next time it's wrong and we look like dorks.

    Option 2: Root cause analysis shows the report isn't the culprit - it's really bad data being imported into your source system from a third party.

    Now you have some additional options. You can apply edits to incoming data and weed out the bad stuff, giving a report of errors to the business unit. And they can go back to the third party, and maybe reduce the number of bad records coming at ya. Then of course there's the cleanup of the bad data in the source system. Bonus: no more (ok, really it's reduced) bad data in the source system. And hey, no more looking like a dork! 😎
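    The "apply edits to incoming data" step above could look roughly like the following Python sketch. It assumes the third-party feed arrives as rows with `account_id` and `amount` fields; those field names, the validation rules, and the `Rejection` type are all hypothetical and only there for illustration. Clean rows go on to the source system, and rejected rows become the error report for the business unit.

```python
from dataclasses import dataclass


@dataclass
class Rejection:
    """One line of the error report sent back to the business unit."""
    row_number: int
    reason: str


def validate_row(row):
    """Return the reasons a row is bad; an empty list means it is clean.

    The rules here are made-up examples, not anyone's real business rules.
    """
    problems = []
    if not str(row.get("account_id", "")).strip():
        problems.append("missing account_id")
    try:
        amount = float(row.get("amount", ""))
    except (TypeError, ValueError):
        problems.append("amount is not a number")
    else:
        if amount < 0:
            problems.append("amount is negative")
    return problems


def split_feed(rows):
    """Separate an incoming feed into loadable rows and an error report."""
    clean, rejected = [], []
    for number, row in enumerate(rows, start=1):
        problems = validate_row(row)
        if problems:
            rejected.append(Rejection(number, "; ".join(problems)))
        else:
            clean.append(row)
    return clean, rejected


if __name__ == "__main__":
    feed = [
        {"account_id": "A1", "amount": "10.50"},
        {"account_id": "", "amount": "3"},
        {"account_id": "A3", "amount": "abc"},
    ]
    clean, rejected = split_feed(feed)
    for item in rejected:
        print(f"row {item.row_number}: {item.reason}")
```

    The point of keeping the rejects, rather than silently dropping them, is that the list gives the business unit something concrete to take back to the third party.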

    The key to reducing cost is follow-up to deal with the root cause. If there's no political will to do that, then RCA is a waste of time. While it may seem simpler (or "good enough") to put edits in the report, that really just sets the business up for failure: the source data becomes more and more corrupted, and the report code grows more complex until it's nearly impossible to keep the report looking correct (which it really isn't at that point).

    The short version of this is, do you want to do it once right, or do it wrong many times over and over and over? Add up the costs of each, and you'll likely find that doing it right once is much more cost effective over time. To do it right once, you'll need that RCA.


    Here there be dragons...,

    Steph Brown
