My DW Strategie

  • Ive had some great help from post in the last few weeks as i began my first ever DW project. Im currently reading as much as i can on the subject to speed up my development but i would be great full for any feedback as to if my strategie below is acceptable good practise for project im working on.

    1) I have a StagingDW and mainDW database on the same server.

    2) I have built stored procs on our transaction database to pull the data for customers, orders, garments etc. More or less one proc for each dim, fact.

    3) Using ssis i call the stored procs and extract the raw data into my raw tables in my staging database.

    4) Once my raw tables are populated i perform validation on them to remove nulls etc.

    5) After validation i then move the data to my Staging tables in my Staging database.

    6) Once my staging tables are populated i perform any transformation on the staging data.

    7) After transformation I then start to Load into my DW.

    8) Clear tables>Drop Index>Populate Dims>Build Fact>Add Index's

    As i said this is my first DW project and really enjoining it but conscious i may not be following best practises.

    *Im doing full loads on everything just now as the data set is small enough but i expect that to change.

  • Do you apply transformations on the staging tables themselves? (aka using update statements)

    Or are you using views between staging and the data warehouse?

    Aside from the full load, it seems you're following best practices on a high level.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Yes im updating the staging tables directly, should i be??. this project really is a proof of concept to get the business invested into a DW and cube solution im very aware that further down the line this will have to be re-developed to handle incremental but time constraints and data set size dont permit this just now.

  • I'd rather use a view.

    If you update the staging tables themselves, you lose their prior values, which might make debugging a bit harder.

    Sometimes it boils down to personal preference though.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Yes i thought about that which was my thinking with having the raw tables and the staging tables. for example to build the DimCustomer i pull the transactional data into my RawCustomer tables which i then never update. I then Pull the customer data over into my StagingCustomer table and this is where i do my updating etc. I never thought about using views tho.

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply