Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase ««123»»

Data Warehouse Development: Version 0 Expand / Collapse
Author
Message
Posted Thursday, April 7, 2011 11:53 AM
Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Thursday, May 8, 2014 10:08 PM
Points: 358, Visits: 397
I think the first assertion needs a slight modification; a first version DW should produce one reporting subject area, not just one report. From there, it's a question of how many reports are needed, which may be only one but could be several.

So the question to the CxO becomes 'About what subject in our business do you need better information?' Typical answers include 'sales', 'inventory', 'customers'. Then we can ask what in particular about that subject is needed (sales across systems, inventory across systems, customers across time) as a genesis for some potential reports (plural) from that starter DW.

The single report question can set wild expectations and backfire into a monster project request that has to be scaled back, which is why I'm wary of it.

Asking up front for potentially anything sets a high first expectation that may need dramatic scaling down. Let's say the answer from CxO is "I really need a combined sales report from systems A, B, C, D, E and F". CxO smiles, having an instant vision of such a report. Good luck bashing that out in a few weeks, so right off the bat DW person has to tell CxO we can only do systems A and B to start. CxO frowns at the delay.

Now imagine asking about what subject first. CxO says, "I need better insight into total sales because I have to look at 5 reports in different formats." CxO frowns at thinking of the hassle. DW person: "I can make a version 1 DW for you in a month that has a unified sales report from systems A and B and we can add the other systems every month or two after that depending on their complexity." CxO smiles.

Meanwhile, a subject area DW should be able to immediately serve other needs. If the one report method is strictly followed then some small incremental data Y that is involved in the subject area in general but not needed for the report would be excluded. Then when VP1 hears about CxO's upcoming new report and says, "I need exactly that with Y added to it" DW person can either say 'we can put that in the report writers' queue because the data is already on the way' (VP and CxO smile) or 'I'll have to see about adding that in" (VP frowns at the delay).

The subject area method can take 10% to 20% more time but the ability to provide N reports instead of 1 provides N*100% more worth and visibility.
Post #1090110
Posted Thursday, April 7, 2011 12:14 PM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Friday, June 13, 2014 3:47 PM
Points: 14, Visits: 110
magarity kerns (4/7/2011)
I think the first assertion needs a slight modification; a first version DW should produce one reporting subject area, not just one report. From there, it's a question of how many reports are needed, which may be only one but could be several.

I totally agree with you. Start with one subject area, not one report. If you begin with one subject area and use the Kimball four step process, (1) identify the business process, (2) explicitly state the grain or level of detail required, (3) identify the dimensions or business actors acting in that process, and (4) identify the metrics used in the analysis of that business process, you're well on your way to success. Plus, your building the blocks for additional subject areas to be integrated by sharing conformed dimensions. If you're guided by a report, you're much more likely to build a data silo that will be much more difficult to extend and maintain.
Post #1090133
Posted Thursday, April 7, 2011 12:41 PM


SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Yesterday @ 3:29 PM
Points: 1,669, Visits: 4,769
Martin Mason (4/7/2011)
magarity kerns (4/7/2011)
I think the first assertion needs a slight modification; a first version DW should produce one reporting subject area, not just one report. From there, it's a question of how many reports are needed, which may be only one but could be several.

I totally agree with you. Start with one subject area, not one report. If you begin with one subject area and use the Kimball four step process, (1) identify the business process, (2) explicitly state the grain or level of detail required, (3) identify the dimensions or business actors acting in that process, and (4) identify the metrics used in the analysis of that business process, you're well on your way to success. Plus, your building the blocks for additional subject areas to be integrated by sharing conformed dimensions. If you're guided by a report, you're much more likely to build a data silo that will be much more difficult to extend and maintain.

The problem is that in most cases the business doesn't start out by saying "we need a datawarehouse, so let's put our heads together and figure out the best way to go about doing that"
The initital build of the datawarehouse typically stems from a request coming from uppermost management for an enterprise level report or dashboard that shows them X, Y, Z and the IT department just sort of figures out at some point (hopefully before the first deployment) that a datawarehouse of some type would be essential.
Not unless they bring in an outside consultant, or at least one person on the project is up to speed on database architecture, would the initial phase of development ever involve a discussion of Kimball, Inmon, OLAP cubes, or even data replication.
Post #1090165
Posted Thursday, April 7, 2011 5:02 PM


Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Wednesday, August 27, 2014 6:49 PM
Points: 388, Visits: 1,042
Eric M Russell (4/7/2011)
The initital build of the datawarehouse typically stems from a request coming from uppermost management for an enterprise level report or dashboard that shows them X, Y, Z and the IT department just sort of figures out at some point (hopefully before the first deployment) that a datawarehouse of some type would be essential.
Not unless they bring in an outside consultant, or at least one person on the project is up to speed on database architecture, would the initial phase of development ever involve a discussion of Kimball, Inmon, OLAP cubes, or even data replication.


Hi Eric,

I concur. I've seen customers purchase a reporting solution - usually an expensive one at that - and believe they have purchased a business intelligence solution. My guess is they spoke with a salesperson first, and then no one else after. If a business wants to implement a successful data warehouse, I believe they should speak to Data Warehouse or Business Intelligence Architect first. A good BI/DW architect is going to meet with the business people and the IT Department.

Magarity and Martin:

I mentioned Craig Utley in the article and his deference to "boiling the ocean." I agree with him and resist any attempts to increase the scope of Version 0. I've seen the larger projects fail many times. This Version 0 approach hasn't failed yet.

:{>


Andy Leonard
CSO, Linchpin People
Follow me on Twitter: @AndyLeonard
Post #1090312
Posted Thursday, April 7, 2011 5:08 PM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Friday, October 5, 2012 3:30 AM
Points: 138, Visits: 351
Most of the responses here are about what I believe Andy would see as the Version 1. But he talks about the step BEFORE diving in too deep into this. His approach is an exercise about getting to know the business. And using a single report as a guide/goal to help focusing the people you talk to. This one report will probably be thrown away when version 1 comes along which includes the whole subject area and allows for obtaining the same report again (but implemented in a more solid way).
The big thing for version 0 is to get to know the environment in which the data warehouse is going to operate (where are the original data-sources, how reliable is the data-input, who is fudging data on the way to the top, who will be obstructing any change, who will sponsor the project, etc, etc). Delivering the actual version 0 report is just a bonus to show your merits to management, but the real deliverable will be YOUR knowledge about the business inner workings, which will increase your chances of success for version 1 many-fold.
Post #1090318
Posted Thursday, April 7, 2011 10:07 PM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Friday, June 13, 2014 3:47 PM
Points: 14, Visits: 110
I think we're in agreement. If you need to convince executive management of the need for a data warehouse, "Version 0" would serve a vital purpose. There's a right way and wrong way though. The right way, if multiple business subject areas are included in that report, is to scale down the ultimate report to a small time range or a small subset from one or two data sources. And build the dimensional model correctly. The wrong way is to construct a spreadsheet model and the nightware that that ETL will become. Come in after the wrong way was done and was warned never to ever say "data warehouse" to executive management if I wanted to stay employed. If the second approach is used, the result is likely to become worse than if no attempt was ever attempted. In other words, a solid architecture is essential. If you swerve from it, your chances of success are greatly diminished. Looking forward to the next article.
Post #1090369
Posted Thursday, April 7, 2011 11:52 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Thursday, December 15, 2011 6:49 PM
Points: 2, Visits: 9
We are finding that PowerPivot is an excellent tool for performing this 'discovery phase'. Not only does it unearth the data sources its also gives you a heads-up on data quality and potential master data management issues which tend to throw a spanner in many BI projects.

As a BA it allows you to get a far better understanding of the business process and how the data is used to support it, and it allows you to literally sit and design reports with the business users and get their buy in - user adoption is key to maximizing ROI. It is often said that business users don't really know what they want and that with BI projects requirements change very rapidly post-deployment once users actually get their hands on the data and the tools to analyse it. PowerPivot allow us to deliver useful results and built a good rapport with the stakeholders very quickly.
Post #1090400
Posted Friday, April 8, 2011 7:05 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Tuesday, August 26, 2014 6:12 PM
Points: 1, Visits: 54
Great. Next?
Post #1090583
Posted Sunday, April 10, 2011 2:28 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Tuesday, August 13, 2013 12:38 PM
Points: 6, Visits: 65
Thank you firstly for finally writing this DW article.

I believe that you must include in version 0 the slow methods of flattening the data in a preprocessed way and one or two aggregated methods of looking at the report data, (analysis reports). Although the stake holder doesn't explicitly ask for the additional methods it assists you in understanding the data in ways that you would find interesting. You essentially become even more engaged. While creating the aggregates you will invariably create faster, more efficient and elegant way of flattening and pre-processing.

As a programmer, the scripts I generate initially use many procedural lines and use many temp tables. While debugging I can use the temp tables to see how the data is being massaged. Later this gets condensed, some become CTE's. Lately I have also moved the filters to CASE statements as columns with 1/0 results. The advantages for both methods is that I can rerun the scripts repeatedly, daily etc. The CASE test columns allow me to see the results as a pie of sorts, the filtered and the unfiltered, quickly.

Version 1 has the above scripts morphed as an sp, params are refined, and the proc is Compared to a SSIS/ETL. Also Version 1 underlying base tables are expanded, to include aggregate/calc columns, in an effort to solve or produce additional reports that you have now gathered are needed. Then wash rinse and repeat.
Post #1091191
Posted Sunday, April 10, 2011 7:50 PM
SSChasing Mays

SSChasing MaysSSChasing MaysSSChasing MaysSSChasing MaysSSChasing MaysSSChasing MaysSSChasing MaysSSChasing Mays

Group: General Forum Members
Last Login: Tuesday, December 3, 2013 4:40 PM
Points: 654, Visits: 375
I'd like to add one point to your great article. DW ROI is not proportionate to the development time invested.
Some business areas can provide high ROI, with a short development time. Having some idea of how the business operates, I like to approach these high ROI areas first - the outcome leads to a DW selling point to the wider company.

Also, for goodness sake, if the company decides to purchase a logical model like the IBM retail model, it's important that they understand that it isn't a functional DW. I'm not sure how IBM sell some of their models, but at least one company I've worked with has had the expectation that the retail model is a bolt-on solution.
Post #1091211
« Prev Topic | Next Topic »

Add to briefcase ««123»»

Permissions Expand / Collapse