Data Warehouse Development: Version 0

By Andy Leonard

Introduction

I have wanted to write a series on Data Warehouse Development for a long time. I don't know what took me so long to start! In this article I am going to discuss an approach to starting a Data Warehouse project – a way that helps you gather information about the technical infrastructure while establishing your value to the customer.

Don't Boil the Ocean

I first heard "Don't boil the ocean" from Craig Utley and I wholeheartedly agree with his definition and description. Instead of working on a smallish piece of the data warehouse solution, many take the All-Or-Nothing approach.

Most Boil-the-Ocean practitioners follow waterfall project management methodologies. Waterfall methodologies are step-based, sequential approaches to building a data warehouse. They start with Requirements Gathering, traipse through Design, Development, and Deployment, and then end with Maintenance. Data warehouses simply do not lend themselves to waterfall methodologies; they’re better suited to iterative development.

I take a straightforward and simple approach. The first part of this approach is something I call...

Version 0

Version 0 delivers a single report. I talk to a stakeholder or CxO and ask "What do you use to measure how your business is doing?" Usually the answer involves numbers. Those numbers come from somewhere. You would be amazed at how many numbers are copied from one Excel spreadsheet into another Excel spreadsheet in the 21st century.

I track those numbers down, searching until I find their source. Along the way I collect the rules associated with those numbers. Numbers sometimes change as they move from spreadsheet to spreadsheet. Sometimes the change is an aggregation; sometimes it's attenuated in some form, or flattened into some range or band.
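One way to keep track of the rules you collect is to write each one down as a small, named function. The sketch below is only an illustration: the region names, the amounts, and the band thresholds are invented, and the function names are hypothetical rather than anything taken from a real project. It shows one aggregation rule and one banding rule applied in sequence.

```python
from collections import defaultdict


def aggregate_by_region(rows):
    """Aggregation rule: sum the amounts for each region."""
    totals = defaultdict(float)
    for region, amount in rows:
        totals[region] += amount
    return dict(totals)


def to_band(total):
    """Banding rule: flatten a total into a named range.

    The thresholds are made up for this example.
    """
    if total < 1000:
        return "low"
    if total < 5000:
        return "medium"
    return "high"


# Invented sample data standing in for numbers found in a spreadsheet.
source_rows = [("East", 400.0), ("East", 900.0), ("West", 250.0)]

totals = aggregate_by_region(source_rows)
bands = {region: to_band(total) for region, total in totals.items()}

print(totals)
print(bands)
```

Writing the rules this way gives you something concrete to review with the person who owns the numbers.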

The key is to keep going until you get as much information as possible on these numbers. You want to know where the data originated. You want to identify as many rules as possible. You next want to build a mechanism to move the data from the source to some other location. Your goal is to create a report that is visible to the stakeholder or CxO. Moving the data from the source to another location is called ETL – for Extract, Transform, and Load. I like Integration Services (SSIS) for this job but I’ve used SQL queries and stored procedures as well.
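The Extract, Transform, and Load steps can be sketched in a few lines. This is a minimal illustration, not an SSIS package or a production design: it uses Python's built-in sqlite3 module with in-memory databases standing in for the real source and reporting servers, and the table names (orders, sales_by_region), columns, and sample rows are all assumptions made up for the example.

```python
import sqlite3


def extract(source_conn):
    """Extract: read the raw rows from the (hypothetical) source table."""
    return source_conn.execute("SELECT region, amount FROM orders").fetchall()


def transform(rows):
    """Transform: apply one business rule, here a sum per region."""
    totals = {}
    for region, amount in rows:
        totals[region] = totals.get(region, 0.0) + amount
    return sorted(totals.items())


def load(report_conn, summary):
    """Load: land the summarized rows where the report can read them."""
    report_conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_by_region "
        "(region TEXT PRIMARY KEY, total REAL)"
    )
    report_conn.executemany(
        "INSERT OR REPLACE INTO sales_by_region VALUES (?, ?)", summary
    )
    report_conn.commit()


# In-memory stand-in for the source system, with invented sample rows.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders (region TEXT, amount REAL)")
source.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("East", 400.0), ("East", 900.0), ("West", 250.0)],
)

# In-memory stand-in for the reporting location.
report = sqlite3.connect(":memory:")
load(report, transform(extract(source)))

report_rows = report.execute(
    "SELECT region, total FROM sales_by_region ORDER BY region"
).fetchall()
print(report_rows)
```

Keeping the three steps as separate functions makes it easy to swap the source or the landing location once you learn more about the environments.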

This process will usually take no more than a few weeks. Ideally it will take less, but you want to allow yourself time to learn the environments. By “environments” I mean the servers and network topology, but also the relationships you need to build with people, which are vital to your success on this project.

Stuff You Pick Up

Along the way, you learn a bunch about the rest of the project. You learn:

  • who to ask about data;
  • the location of data sources; 
  • how to connect to data sources;
  • data- and access-mapping strategies;
  • where you can land copies of the data for reporting;
  • how to connect to reporting sources; and
  • how to deliver reports.

That's a lot of information about infrastructure that will serve you well throughout the project life cycle. Plus - and this is vital to your success on the project - you demonstrate to the customer you can deliver.

Conclusion

Version 0 is an important deliverable in a data warehouse project that sets the stage for the rest of your development work.

:{>
