
Slacking and data quality (in that order)

Well, I almost missed blogging for the entire month of June.  I'm sure that this fact didn't go unnoticed by both of the people who read my blog...  I'm working on a major data conversion and am in a mad dash to finish converting and validating years of healthcare and financial data, and unfortunately my free time (including the time allocated for blogging) has been scarce.  The good news is that the project - at least the data conversion piece - will be over in late September and perhaps life will return to some semblance of normalcy.

The aforementioned project has been an interesting exercise in data quality.  The system from which I am extracting data is quite old, in technology years anyway, and the application design lacks some of the keystones of modern systems - not the least of which is relational integrity.  The de facto standard for data entry was free text, which produced many duplicates (in some cases, tens of thousands).  Fortunately, the system to which I am converting has a well-designed SQL Server backend, and in spite of a few disagreements, the vendor has been open to modifying the system to suit our needs.  As to the quality of our data, I've had lots of opportunities to expand my SSIS skills to gently (most of the time) massage the data into the target system.  I've even been able to write some code, which I don't do that much anymore, for some advanced text parsing and manipulation.

Once this project is complete, I'll write a more comprehensive - and coherent - post to discuss in more detail my travels through this conversion and some of the data quality lessons I've learned.

Tim Mitchell

Tim Mitchell is a business intelligence consultant, author, trainer, and Microsoft Data Platform MVP with over thirteen years of data management experience. He is the founder and principal of Tyleris Data Solutions.

Tim has spoken at international and local events including the SQL PASS Summit, SQLBits, and SQL Connections, along with dozens of tech fests, code camps, and SQL Saturday events. He is coauthor of the book SSIS Design Patterns, and is a contributing author on MVP Deep Dives 2.

You can visit his website and blog at TimMitchell.net or follow him on Twitter at @Tim_Mitchell.

