Using ODS As A Middle Layer

,

A traditional approach in data warehousing is to transform data from the stage

into a dimensional store (a.k.a. spatial store), i.e. in a star schema (or snow

flake) form. Ralph Kimball explained this in great detail in his book The

Data Warehouse Toolkit. A traditional approach in building an operational

data store (ODS) is to transform data from the stage (or straight from source)

into a third normal form data model and store it there. Inmon explained ODS

concept in his book

Building Operational Data Store. Stage is an area where we placed files

or tables from the source system temporarily, before we load them into either

ODS or data warehouse.

Some data warehouse implementations use ODS as a middle layer. This ODS

also takes the 3rd normal form, but unlike the original Inmon & Claudia's

concept, which states that ODS only stores data for a short period of time, this

new ODS contains all the history. So in this sense, one could argue that it is

not accurate to call this middle layer an ODS.

This ODS concept would be particularly useful when we have several business

systems. The new layer unites the business systems and builds a single source

of data feeding the data warehouse. Say we have 3 systems: system A, system

B and system C. And all 3 of them have customer tables. In the ODS we "overlay"

those 3 tables on top of each other. Not only it is easier to feed the data

warehouse but also the ODS in practice would becomes a new integrated business

system. Initially it might be a read only system, but it could be easily developed

to take updates, for example by creating maintenance modules. This could be

done in an SOA way in .NET, or we could utilise software such as

ironspeed.

Data warehouse contains data for the entire enterprise: finance, operation,

logistics, human resources, marketing, strategic planning, sales, purchasing,

etc. Data marts contains one specific subject area, for example: marketing.

I must stress here that the level of granularity in the data marts is not just

the roll up level (aggregates) but also contains the bottom/detail level,

just like the granularity of the data warehouse.

If we have several data marts in the enterprise, the ODS is particularly useful.

For example, we could build a finance mart complete with all its dimensions,

and have another mart for operational purpose. Both marts taking the data from

the ODS, and the enterprise warehouse is also taking data from the ODS.

Another example is a group of companies which has subsidiaries in several industries.

Each company needs to have their own enterprise warehouse but they also need

to have a group-wide warehouse. In this case both the smaller company warehouses

and the bigger group warehouse would be build from a group-wide ODS.

The rational of building the smaller warehouses directly from the ODS is two

fold: performance and consistency.

Flexibility is another distinct advantage of having ODS as a middle layer.

The data warehouse is not directly interfacing with the source system, so it

is naturally protected. It is a lot easier to rebuild the warehouse, whether

it is total rebuilt or only partially rebuilt. This is because everything is

maintained in the ODS. They keys, the facts, the history, etc. We could opt

to limit our operational warehouse to 12 months data only, and occasionally,

if some projects require further data history, we could build a 3 year warehouse

easily, just by changing 1 parameter in the control system.

The benefits do come at a price. As we have one more layer, the data warehouse

batch window increases. Whatever ETL tool we use, this means more code to maintain.

The metadata helps, but it still increases development time and maintenance

effort. It also increases disk space requirements. The size of the ODS is typically

not far from the size of the enterprise data warehouse, plus or minus 20%. Is it

worth it? It depends on the size of our warehouse, how many business systems

we have, whether there is a need of integration, and how many marts we have

to produce and maintain.

If we are talking about 3 fact tables and 10 dimensions, only 1 warehouse without

any marts, and from 1 source system, it is not worth it. If you have 3 source

systems (and therefore would appreciate any means of integration), you have 2

marts, and we are talking 10 fact tables with 50 dimensions, yes, it is well

worth it. Can we build the ODS layer later? Yes we can but it will be very costly.

We need to reposition the data firewall, the key generation, the metadata, just

to name a few. The work of adding an ODS layer could take almost the same effort

as the building the data warehouse in the first place. Having an ODS as a middle

layer is a decision that we need to make before we build the warehouse system.

Vincent Rainardi

4th March 2006

Rate

5 (3)

Share

Share

Rate

5 (3)