Data Warehouse Development: Version 0

  • I think we're in agreement. If you need to convince executive management of the need for a data warehouse, "Version 0" would serve a vital purpose. There's a right way and wrong way though. The right way, if multiple business subject areas are included in that report, is to scale down the ultimate report to a small time range or a small subset from one or two data sources. And build the dimensional model correctly. The wrong way is to construct a spreadsheet model and the nightware that that ETL will become. Come in after the wrong way was done and was warned never to ever say "data warehouse" to executive management if I wanted to stay employed. If the second approach is used, the result is likely to become worse than if no attempt was ever attempted. In other words, a solid architecture is essential. If you swerve from it, your chances of success are greatly diminished. Looking forward to the next article.

  • We are finding that PowerPivot is an excellent tool for performing this 'discovery phase'. Not only does it unearth the data sources its also gives you a heads-up on data quality and potential master data management issues which tend to throw a spanner in many BI projects.

    As a BA it allows you to get a far better understanding of the business process and how the data is used to support it, and it allows you to literally sit and design reports with the business users and get their buy in - user adoption is key to maximizing ROI. It is often said that business users don't really know what they want and that with BI projects requirements change very rapidly post-deployment once users actually get their hands on the data and the tools to analyse it. PowerPivot allow us to deliver useful results and built a good rapport with the stakeholders very quickly.

  • Great. Next?

  • Thank you firstly for finally writing this DW article.

    I believe that you must include in version 0 the slow methods of flattening the data in a preprocessed way and one or two aggregated methods of looking at the report data, (analysis reports). Although the stake holder doesn't explicitly ask for the additional methods it assists you in understanding the data in ways that you would find interesting. You essentially become even more engaged. While creating the aggregates you will invariably create faster, more efficient and elegant way of flattening and pre-processing.

    As a programmer, the scripts I generate initially use many procedural lines and use many temp tables. While debugging I can use the temp tables to see how the data is being massaged. Later this gets condensed, some become CTE's. Lately I have also moved the filters to CASE statements as columns with 1/0 results. The advantages for both methods is that I can rerun the scripts repeatedly, daily etc. The CASE test columns allow me to see the results as a pie of sorts, the filtered and the unfiltered, quickly.

    Version 1 has the above scripts morphed as an sp, params are refined, and the proc is Compared to a SSIS/ETL. Also Version 1 underlying base tables are expanded, to include aggregate/calc columns, in an effort to solve or produce additional reports that you have now gathered are needed. Then wash rinse and repeat.

  • I'd like to add one point to your great article. DW ROI is not proportionate to the development time invested.

    Some business areas can provide high ROI, with a short development time. Having some idea of how the business operates, I like to approach these high ROI areas first - the outcome leads to a DW selling point to the wider company.

    Also, for goodness sake, if the company decides to purchase a logical model like the IBM retail model, it's important that they understand that it isn't a functional DW. I'm not sure how IBM sell some of their models, but at least one company I've worked with has had the expectation that the retail model is a bolt-on solution.

  • Nice article, thanks Andy. I can't wait to see the rest of the series.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • The intent of Version 0 is to show what can be achieved. A taster with a promise of jam tomorrow. My worry is that executive thinking is "if they developed this with in no time and no resource then I can get a lot for very little outlay".

    As in an earlier post there has to be careful management of expectations.

    I've had a proof-of-concept backfire on me spectacularly where most of the people involved agreed that the POC had failed in almost every tangible way but mission critical functionality had found its way into the POC so it went ahead anyway.

    I am also extremely wary of Excel. I think I've seen it described as a Spreadmart! On one hand the business get great utility and early pay back. On the other hand they've driven a very large truck down a very tight cul-de-sac.

  • David.Poole (4/11/2011)


    The intent of Version 0 is to show what can be achieved. A taster with a promise of jam tomorrow. My worry is that executive thinking is "if they developed this with in no time and no resource then I can get a lot for very little outlay".

    As in an earlier post there has to be careful management of expectations.

    I've had a proof-of-concept backfire on me spectacularly where most of the people involved agreed that the POC had failed in almost every tangible way but mission critical functionality had found its way into the POC so it went ahead anyway.

    I am also extremely wary of Excel. I think I've seen it described as a Spreadmart! On one hand the business get great utility and early pay back. On the other hand they've driven a very large truck down a very tight cul-de-sac.

    Excel is a fine business intelligence tool, so long as it's used for reporting or pivoting data from the Data Warehouse. If your team includes a business analyst who is also sharp with Excel, then the database developers can focus on the heavy lifting of creating the ETL processes and data model for the warehouse while the BA stays focussed on the client and handles all of their 1001 requests.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Good Day Andy,

    Must say I think I like your work, simple and methodical!!! Version 0 is a very critical step that many Developers/Managers tend to ignore, they dive staight into the deep-end without the critical work of preparation. I'm glad you highlight it's importance.

    It seems to me that you will be using the Kimball Methodology, as you mention "Data warehouses simply do not lend themselves to waterfall methodologies; they’re better suited for iterative development", to develop your DW System. Is this the case?!

    Looking forward to your next Piece of Art!!!

    A keen follower of your work on the Stairway to Integration Services!!!

    Regards,

    Aquila Hanise

    BI Developer

    South Africa

  • Nice Article, Andy... thanks. You know the process you describe is the process you should go through to scope any project... not just DW! Look forward to reading more. Hey why not suggest that we can subscribe to any artile you write? The search on the site leaves a lot to be desired, in part becouse there is so much out here!

Viewing 10 posts - 16 through 24 (of 24 total)

You must be logged in to reply to this topic. Login to reply