How to Design a Data Staging Area in a Data Warehouse

  • Hi,

    I am a trainee in Business Intelligence, and I was asked to list the best practices for designing a data staging area, covering architecture, development and maintenance. What infrastructure is needed, and should the staging area be in the same database as the data warehouse?

  • It depends, usually on the complexity and volume of the data.

    I personally take the approach of landing the data as is from the source system, with maybe a few metadata columns added for future use.
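
    As a rough sketch of that landing step, assuming a hypothetical customer feed and using SQLite only to keep the example self-contained: the table name stg_customer, the source columns and the three metadata columns (_source_system, _batch_id, _loaded_at) are illustrative choices, not a standard.

```python
import sqlite3
from datetime import datetime, timezone


def create_landing_table(conn):
    """Create a landing table: source columns as-is plus metadata columns."""
    conn.execute("""
        CREATE TABLE IF NOT EXISTS stg_customer (
            customer_id    TEXT,              -- source columns, untyped and unconstrained
            customer_name  TEXT,
            city           TEXT,
            _source_system TEXT NOT NULL,     -- metadata: where the row came from
            _batch_id      INTEGER NOT NULL,  -- metadata: which load run delivered it
            _loaded_at     TEXT NOT NULL      -- metadata: UTC load timestamp
        )
    """)


def land_rows(conn, rows, source_system, batch_id):
    """Insert source rows unchanged, appending only the metadata values."""
    loaded_at = datetime.now(timezone.utc).isoformat()
    conn.executemany(
        "INSERT INTO stg_customer VALUES (?, ?, ?, ?, ?, ?)",
        [(*row, source_system, batch_id, loaded_at) for row in rows],
    )
    conn.commit()
    return len(rows)
```

    Nothing is cleaned or validated here on purpose; stray whitespace and NULLs from the source are kept so the landing table can always be compared back to what the source sent.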

    Then transform it through to a pre-load area so that it is in roughly the same structures as the dims and facts, with only the surrogate keys left to be assigned as part of the load process.
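
    To illustrate the transform into the pre-load area, here is a minimal sketch in plain Python. The field names and the cleaning rules (trimming, upper-casing the business key, defaulting a missing city to "Unknown", keeping the last row per key) are assumptions made up for the example; your own rules will depend on the source.

```python
def to_preload(staged_rows):
    """Shape landed customer rows like the dimension, minus the surrogate key."""
    latest = {}
    for row in staged_rows:
        # Normalise the business key so duplicates from the source collapse.
        business_key = row["customer_id"].strip().upper()
        latest[business_key] = {
            "customer_id": business_key,
            # Collapse repeated whitespace and standardise the casing.
            "customer_name": " ".join(row["customer_name"].split()).title(),
            # Replace missing or blank cities with a placeholder value.
            "city": (row.get("city") or "").strip() or "Unknown",
        }
    # A later row for the same business key overwrites an earlier one.
    return list(latest.values())
```

    The output has the same columns as the target dimension except for the surrogate key, which is left for the load step.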

    Finally, land the data in the dim/fact tables with the surrogate keys. This way the pre-load area acts as a contract layer to the DW.
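
    A sketch of that final load for a dimension, again using SQLite purely for illustration. It assumes a hypothetical dim_customer table, lets the database generate the surrogate key, and overwrites changed attributes in place (Type 1 behaviour); history tracking is not shown.

```python
import sqlite3


def load_dim_customer(conn, preload_rows):
    """Update existing dimension rows and insert new ones from the pre-load area."""
    conn.execute("""
        CREATE TABLE IF NOT EXISTS dim_customer (
            customer_key  INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
            customer_id   TEXT NOT NULL UNIQUE,               -- business key
            customer_name TEXT,
            city          TEXT
        )
    """)
    for row in preload_rows:
        # Try to update the row matching the business key first.
        updated = conn.execute(
            "UPDATE dim_customer SET customer_name = ?, city = ? "
            "WHERE customer_id = ?",
            (row["customer_name"], row["city"], row["customer_id"]),
        ).rowcount
        if updated == 0:
            # No existing row: insert, and the surrogate key is assigned here.
            conn.execute(
                "INSERT INTO dim_customer (customer_id, customer_name, city) "
                "VALUES (?, ?, ?)",
                (row["customer_id"], row["customer_name"], row["city"]),
            )
    conn.commit()
```

    Because the pre-load rows already match the dimension's shape, the load itself has no business logic in it, which is what makes the pre-load area work as a contract.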

    As for architecture, if you have a number of source systems then you may want an ODS to conform the data as part of the pre-load; if it is a single source then you would omit this stage.

    Server-wise, most of the grunt work is done when transforming the staging data into the pre-load interface; after that it is mainly lookups for keys, inserts for new data, and updates to existing data.
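
    The key lookup for facts can be as light as a dictionary probe per row. This is a sketch under assumptions: the field names are hypothetical, the lookup is a business-key to surrogate-key mapping read from the dimension beforehand, and -1 stands for an "unknown" dimension member, which is a common convention but not something every warehouse uses.

```python
def assign_keys(preload_facts, key_lookup, unknown_key=-1):
    """Swap each fact row's business key for the dimension's surrogate key."""
    facts = []
    for row in preload_facts:
        facts.append({
            # Fall back to the unknown member when the dimension row is missing.
            "customer_key": key_lookup.get(row["customer_id"], unknown_key),
            "order_date": row["order_date"],
            "amount": row["amount"],
        })
    return facts
```

    For example, with key_lookup = {"C1": 1, "C2": 2}, a fact row for customer "C9" would be loaded against key -1 instead of being rejected.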

    Most DW load processes should run on a single server unless they are very large, in which case you start to consider separate servers for the various tiers, such as staging and ODS on one server and the warehouse on a second server.

