One great promise of data warehousing over the years has been a "single view of the truth" for companies. By moving, cleaning, transforming, and standardizing data from other systems, we can put together a single location for the most accurate data a company can have. I suspect there are some organizations that have had success here, but many are struggling. The idea behind some of the newer SSIS tasks, Master Data Services (MDS), and Data Quality Services (DQS) in SQL Server is that we have some functions in the SQL Server platform to make this easier to achieve, or perhaps to help us achieve it with a higher rate of success.
However, just moving data to a central location isn't necessarily the only way to deal with the challenges of data. Perhaps there's a better way, a more distributed way that provides a framework for centralization, but distributes the ownership and knowledge of the data to others. Buck Woody recently wrote about data hubs as a project and idea that Microsoft is making available. It's a place to publish data for your organization, owned by groups inside your company, but available for others to use.
Many of us have experienced the issues of each developer or each department attempting to manage their own sources and lookup data. Even well-known data such as postal codes can change and quickly become stale. How many developers are willing to write import routines to update something such as postal codes, let alone internal data such as customer names?
Ideally, I think we should pull lots of our initial data from corporate sources, and establish central data hubs for all new types of information we gather and support. From there, a variety of import and update routines could be written and shared by all the applications people use. It wouldn't provide perfection in terms of data quality and freshness, but it would be better than allowing each individual developer to make their own decisions.
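As a rough illustration of what one of those shared update routines might look like, here is a minimal sketch in Python. It assumes a hub can hand us (code, name) pairs for a lookup list such as postal codes. The function name refresh_lookup, the table layout, and the use of SQLite as the application's local store are all assumptions made for the example, not part of any Microsoft data hub project.

```python
import sqlite3


def refresh_lookup(conn, table, rows):
    """Insert or replace (code, name) rows in a local lookup table.

    `rows` stands in for whatever a data hub would publish; how the rows
    are fetched from the hub is deliberately left out of this sketch.
    Returns the number of rows in the table after the refresh.
    """
    # The table name is interpolated into SQL, so accept only plain identifiers.
    if not table.isidentifier():
        raise ValueError(f"unsafe table name: {table!r}")

    conn.execute(
        f"CREATE TABLE IF NOT EXISTS {table} "
        "(code TEXT PRIMARY KEY, name TEXT NOT NULL)"
    )
    # INSERT OR REPLACE overwrites stale entries that share a primary key.
    conn.executemany(
        f"INSERT OR REPLACE INTO {table} (code, name) VALUES (?, ?)",
        rows,
    )
    conn.commit()
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    # Hypothetical sample rows standing in for data published by a hub.
    refresh_lookup(conn, "postal_codes", [("80202", "Denver"), ("10001", "New York")])
    total = refresh_lookup(conn, "postal_codes", [("80202", "Denver, CO")])
    print(total)
```

The point of the sketch is only that the routine is written once and reused: any application that needs the lookup calls the same function against its own connection rather than maintaining a private import script.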
The Voice of the DBA Podcasts
We publish three versions of the podcast each day for you to enjoy.
The podcast feeds are available at sqlservercentral.mevio.com. Comments are definitely appreciated and wanted, and you can get feeds from there.
Today's podcast features music by Everyday Jones. No relation, but I stumbled on to them and really like the music. Support this great duo at www.everydayjones.com.
You can also follow Steve Jones on Twitter.