SQL Clone
SQLServerCentral is supported by Redgate
Log in  ::  Register  ::  Not logged in

The Numbers Don’t Lie… Except When They Do

There are few things more reassuring for a data professional than having clean, consistent data to back up critical business decisions.  The numbers don’t lie, or so they say.  But can the right data lead to wrong conclusions?  Sadly, yes, and I suspect that it happens more often than we’d like to admit.

Recently, as part of a large hospital project I’ve been working on, I’ve been addressing questions around the all-important census, the magic number of patients bedded in a given facility.  This facility converted from an outdated software package to a more modern, SQL-Server based product about a year ago, and one of the key goals with the new system was to “get the census right”.  See, the old system had at least a few dozen census reports, most of which were in disagreement with the others about the true count of in-house patients.  Because the software package was dated, the most common complaint was that “the system” was reporting incorrect information.  However, during my review of the archived reports, it quickly became clear that the reports from the old system were all correct.  The malfunction was not in the answers, but in the questions.

Speaking Different Languages
The root of these types of problems is consistently identifying the performance metrics of a business.  In our census example, one report might include only admitted patients, while another could include those still in triage in the ER.  One report shows the count of inpatients as of the previous midnight, while another provides the same information in real time.  In isolation, each of these metrics is correct, but when held side-by-side with the others, it appears that the output is wrong.  This is, in my experience, a trend on the rise as the movement toward self-service reporting continues to grow: more and more end users are querying their information systems than ever before, and many of them are basing critical decisions on loosely defined standards and definitions.

To avoid these assumptions and half-truths with data, I offer the following preferred practices:

  • Clearly define the metrics, entities, and standards that are critical to your business intelligence, and share them with all principals.  Often this involves answering what appear to be silly questions: “What is a day?”, “What is a patient?”, “What is a sale?”, “What is a billable hour?”, etc.  By clarifying these elemental questions, your downstream metrics will be improved because everyone understands what is and is not included in these definitions.  Once those terms are defined, be dogmatic in their use.
  • Involve all business units in those standards-setting conversations.  Regardless of your industry, you should include principals from each major facet of your business – sales, marketing, decision support, executives, customer service – to ensure not only a comprehensive understanding of the reporting needs but to also create a feeling of ownership in the process.  If they believe in it, they’ll support it.
  • Ask leading questions.  Don’t simply give everyone ad-hoc access to your raw data; use the technical tools available to enable managed self service reporting (security controls, data warehousing, denormalizing views, or more encompassing tools such as PowerPivot) to limit the open-endedness of most user queries.  The data should be versatile enough to answer the important business questions without creating a free-for-all where the results could be made to show almost anything.
  • Validate your output.  There should be a formal, mandatory validation process for each new output created, whether it’s a simple report or an entire volume of data.  Having a validation process that crosses business units is highly desirable, as this lends itself to more accurate and versatile (reusable) reports.  Part of this review process should be to confirm that the data retrieved is not already provided by existing outputs.

With the growing trend toward self-service reporting, it’s more important for everyone in a business to keep on the same page.  It’s still going to be a challenge, but by following these few practices, this process should be a little easier.

Tim Mitchell

Tim Mitchell is a business intelligence consultant, author, trainer, and Microsoft Data Platform MVP with over thirteen years of data management experience. He is the founder and principal of Tyleris Data Solutions.

Tim has spoken at international and local events including the SQL PASS Summit, SQLBits, SQL Connections, along with dozens of tech fests, code camps, and SQL Saturday events. He is coauthor of the book SSIS Design Patterns, and is a contributing author on MVP Deep Dives 2.

You can visit his website and blog at TimMitchell.net or follow him on Twitter at @Tim_Mitchell.


Posted by jcrawf02 on 10 December 2009

Very important concept, and I agree whole-heartedly. Especially in healthcare, where 'Inpatient', 'ER Services', 'Admitted', 'Encounter' can be ambiguous terms depending on who you are speaking to.

We're working on defining views within our reporting team so that we can all use the same metrics, rather than finding that the reports we're offering to different departments vary by a few dx codes (or whatever).

Leave a Comment

Please register or log in to leave a comment.