RE: Using ODS As A Middle Layer

SSC Enthusiast

Points: 126

December 4, 2008 at 12:41 pm

Hi Vincent,

Thank you for your answers.

One architecture that i usually use is described as follows :

- Extract the Transactional Data from Multiple Source Systems to a Staging Area. This Staging Area is not permanent. We extract Data with less transformations, and we store Data about the same subject from different systems in the same table (the structure of the table is a standardized to receive data from different systems, we also add some columns to differentiate source system...). The reason why we use a staging area is to disconnect as quickly as possible from the source systems in order to not disturb the operations.

- If it's possible We extract only delta Data from the source systems to the staging area.

- Transform the data and load it in the data warehouse (usually in 3NF), this Data warehouse allow us to track history over time, so we can provide a picture of our activity at any time in the past ( This is very useful in the insurance field). We try to never update the Data Warehouse tables, even if some times we can, in collaboration with business users, decide that some kind of changes in the source system are not very interesting so we can update some transactions...

- For performance reasons, we create Datamarts ( a set of aggregated fact tables and dimensions), to use for Analysis. The best design in this stage is the dimensional form.

- To explain analysis, we have to access to the detail, which is stored in the Data Warehouse database, so we create reports on Data warehouse Database and allow access to them from Analysis developed using Aggregated Datamarts by passing parameters ( This reduces the response time and by using some techniques, we can force the users to drill down to the correct level to retrieve fewer data from the data warehouse database).

- And so on.

Why i never had to deal with ODS :

If there is a need to deal with operational reporting :

- At first we reduce the refreshment periodicity of the Data Warehouse ( We can go as far as the ETL can do)

- At the first load of the Data warehouse database, We also load separately all useful tables for operational reporting. That's what i call the current version of the transactions.

- for the following extractions, we UPSERT these tables from the staging area in parallel with the load of the Data Warehouse Database ( this Database store the current version and all the past versions)

- If for any reason, there is a need to reconstruct our Datamarts or our Current version, We can do it from the Data Warehouse Database.

Hope this gives you an idea on how i do Data warehousing... i don't know if it's good or not, but till now the only problem i encountered is Data explosion :-).

As you can imagine, almost 70% of the effort to build a good DW/BI Solution is consumed by the ETL...

Abdel