SQL Clone
SQLServerCentral is supported by Redgate
Log in  ::  Register  ::  Not logged in

DBA Hypochondria

By Tony Davis,

SQL Server, like other server systems, has become such a complex machine that it is beginning to take on some of the attributes of an android. When DBAs discuss ailing machines, something of a medical air affects the conversation. There is talk of symptoms, diagnostics, cures, viruses, leakage, dumps, and other concepts borrowed from human health.

The same errors that sometimes afflict the medical profession affect DBAs too. It is tempting to collect server symptoms randomly, in the forlorn hope that the more they collect, the clearer a picture they'll assemble of what ails the server. They should, of course, be looking at the whole patient (hang on; I think I meant 'server').

"Doctor, come quickly! Buffer Cache Hit Ratio is way down; Disk Queue Length is through the roof, Page Life Expectancy has dipped below the magic '300'. On with the white coat, alerts are going off all over the place…the patient is in terrible shape; should we call the relatives?"

In fact, of course, there is a good chance the 'patient' is just fine. Examining metrics such as these is the equivalent of the eighteenth century doctor examining only the patient's bowel movements, regardless of the reported symptoms. The chances of a correct diagnosis or, indeed, of discerning whether anything at all is wrong with the patient, beyond their diet, are slim.

Some metrics are more-or-less meaningless (BCHR). Others seem inextricably linked to 'magic thresholds' that ceased to be accurate many years ago (PLE). Many more lose meaning when applied with a uniform threshold value, across all environments. As Adam Machanic noted is his 10 commandments of Monitoring, the only good universally applicable threshold value is "server power is on". All others have a significance that is frustratingly ambiguous. Not only won't such metrics help you find the real problem, they will also distract and annoy you every time you receive an alert that a threshold is breached.

What's required in SQL Server diagnosis is a more formal approach based on good medical diagnosis where the whole patient is assessed; lifestyle, stress, baseline activity . Rather than randomly inserting thermometers and extracting blood, a doctor will sit the patient down, get a feel for their overall health, check their medical history, formulate a possible diagnosis and perform specific tests to prove or disprove it. They look for symptoms that stray significantly from "normal", for the patient as well as humanity in general.

Likewise, for each of your SQL Server environments, establishing what is "normal" is the key, and the cure for DBA hypochondria. In other words, you need means baselines, and ideally some statistical algorithms for analyzing the data over time and spotting significant trends amongst the "noise".

Who owns up to being a DBA hypochondriac? Or wants to share their "baseline formula" cure?


Total article views: 107 | Views in the last 30 days: 1
Related Articles

The Secure Medical Data Challenge

This week Steve looks at the challenges of security for medical data.


Taking A Medical Approach to Performance Troubleshooting

Taking a Medical Approach to Performance Troubleshooting I find it difficult sometimes to explain t...


unique patients from duplicate patient row entries

I have one table "patients"...one row per patient. Patient.ID, Patient.Name, Patient.address, ...


Medical IT

Hello there. I have a rather broad question I would like to pose.   I am in the medical field and ...


Healthcare / Medical sample database?

Does anyone know of an "AdventureWorks-esque" sample database for the heatlhcare, health insurance o...

database weekly