Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase ««12

Using ODS As A Middle Layer Expand / Collapse
Author
Message
Posted Monday, October 27, 2008 9:07 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Monday, May 3, 2010 9:35 PM
Points: 3, Visits: 14
Hi Vincent,

Thanks for the response,
Information you have provided would give
A better direction and soild focus in dealing with
Building of an ODS.

would n't you mind I require some more clarity
About those DO's and Dont's.

(1) 4 th DO:
Would you provide me information about how to define
ODS Key/ID with an "Example"?

(2) 5 th DO:

Would you provide me information about how to define
"Data Fire Wall" between Source system and ODS with
an "Example"?

If would assist me with relavent information that
would helpful for my project.

Regards,
Vijay

Post #592178
Posted Monday, October 27, 2008 10:15 AM


SSC Journeyman

SSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC Journeyman

Group: General Forum Members
Last Login: Friday, September 26, 2014 12:56 AM
Points: 81, Visits: 190
Hi Vijay, the ODS key is a surrogate key, i.e. sequential integer beginning with 0. For example: 0, 1, 2, 3, ... etc.

Data firewall is a collection of data quality rules in the ETL system. This is industry specific. For example, in an insurance ODS "if the capital adequacy ratio is less than certain limit, raise a warning". Or in a CRM ODS (CDI, Customer Data Integration), "if the email address is invalid, then flag it so that no campaign is sent"

Regards,
Vincent
Post #592231
Posted Sunday, November 30, 2008 6:40 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Monday, January 23, 2012 7:47 AM
Points: 2, Visits: 26
Hi Vincent,

I read the discussion from the first posts, and i can tell you that your first article about ODS confused me a lot.
I read a lot of articles about ODS and Data warehouse. I also worked on several DW/BI projects and till now i never had to deal with ODS, we put every detail in the data warehouse, sometimes in 3 Normal form and others in Dimensional Form...

Concerning ODS, term which was coined by Inmon and Imhof, had to serve tactical decisions rather then strategic ones. That's good, but when i read kimball's article about ODS, i understood that it's only question of the periodicity of the refreshment of data... So since ETL technology are moving from batch mode to near realtime mode, i don't see any need to still have an ODS.

What do you think about this ?

By the way, i just had your book, i will read it... at first view it seems to be very interesting


Abdel.
Post #610859
Posted Tuesday, December 2, 2008 3:14 AM


SSC Journeyman

SSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC Journeyman

Group: General Forum Members
Last Login: Friday, September 26, 2014 12:56 AM
Points: 81, Visits: 190
You are right Abdel, with micro-batch or near real time ETL, data warehouses now have new functions. It’s now used not only for analysis, but also for operational reporting and supporting daily business operations, which reduces/eliminates the needs for an ODS.

And you are right, we should only create ODS when we need one. ODS has associated additional costs both in development effort and operational overhead: more ETL processes to execute, longer loading times. If until today you have not dealt with ODS, that’s good — chances are you don’t need it. Many companies don’t use ODS; they only have a dimensional data warehouse.

The reasons why some companies use ODS are: (these vary from company to company)
1) The dimensional data warehouse is not updated frequently enough, so they created ODS which is refreshed more frequently. For example, the dimensional DW is updated once a day whilst the ODS is updated 4 times during working hours.
2) The dimensional data warehouse does not store data at the lowest level grain. It stores historical data at aggregate level, for example daily snapshots of account balances for the last 5 years. Therefore they created an ODS which stores data at the lowest level, for example the current version of all transactions.
3) The dimensional data warehouse is in denormalized structure and they need to run some report from a normalized data structure. So they created ODS which is in normalized data structure to support these reports.
4) To support operational and regulatory reporting, they need to maintain group-wide data such as standing data and operational targets. These are not available in the source systems as the source systems are company-wide (the group consists of several companies with different business systems). The ODS is not only used on read-only basis for reporting but also for maintaining this data. An example of such data is reporting structure and data conversion elements.
5) Integration. The ODS tables are very similar to their business systems, but it contains data from multiple business systems, and it’s better designed (it’s newer, more structured, more complete, more normalized, more stringent naming conventions). Hence for some reports they prefer to report from ODS then from individual business system or from the dimensional data warehouse.
6) Performance. Although the same data is available on the dimensional data warehouse, because of its normalized structure some reports execute faster when they are pointed to the ODS. The fact that the ODS stores only the latest/current version (no history) also helps the report performance, compared to the dimensional data warehouse which stores all versions and years of snapshot data. Of course the other way around is also true: some reports execute faster when they are pointed to the dimensional data warehouse instead of ODS.

About the term "dimensional data warehouse" above: I prefer to call it “dimensional data store” or DDS. Where as the term "data warehouse system" encompasses the whole thing: stage, ETL, ODS, DDS, DQ, control, audit, cubes and the front-end applications, as I described on this article: Business Intelligence or Data Warehouse and in my book.
On the other hand, the term "data warehouse database" refers to either a relational DDS (as I mentioned here in this post) or a multidimensional data store (OLAP cubes). The data warehouse databases are shown in blue in the above article.

I appreciate that you may have different views Abdel and I am interested to learn from you. So I'd be grateful if you would post your views.

Kind regards,
Vincent Rainardi
Post #611921
Posted Thursday, December 4, 2008 12:41 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Monday, January 23, 2012 7:47 AM
Points: 2, Visits: 26
Hi Vincent,

Thank you for your answers.

One architecture that i usually use is described as follows :
- Extract the Transactional Data from Multiple Source Systems to a Staging Area. This Staging Area is not permanent. We extract Data with less transformations, and we store Data about the same subject from different systems in the same table (the structure of the table is a standardized to receive data from different systems, we also add some columns to differentiate source system...). The reason why we use a staging area is to disconnect as quickly as possible from the source systems in order to not disturb the operations.
- If it's possible We extract only delta Data from the source systems to the staging area.
- Transform the data and load it in the data warehouse (usually in 3NF), this Data warehouse allow us to track history over time, so we can provide a picture of our activity at any time in the past ( This is very useful in the insurance field). We try to never update the Data Warehouse tables, even if some times we can, in collaboration with business users, decide that some kind of changes in the source system are not very interesting so we can update some transactions...
- For performance reasons, we create Datamarts ( a set of aggregated fact tables and dimensions), to use for Analysis. The best design in this stage is the dimensional form.
- To explain analysis, we have to access to the detail, which is stored in the Data Warehouse database, so we create reports on Data warehouse Database and allow access to them from Analysis developed using Aggregated Datamarts by passing parameters ( This reduces the response time and by using some techniques, we can force the users to drill down to the correct level to retrieve fewer data from the data warehouse database).
- And so on.

Why i never had to deal with ODS :
If there is a need to deal with operational reporting :
- At first we reduce the refreshment periodicity of the Data Warehouse ( We can go as far as the ETL can do)
- At the first load of the Data warehouse database, We also load separately all useful tables for operational reporting. That's what i call the current version of the transactions.
- for the following extractions, we UPSERT these tables from the staging area in parallel with the load of the Data Warehouse Database ( this Database store the current version and all the past versions)
- If for any reason, there is a need to reconstruct our Datamarts or our Current version, We can do it from the Data Warehouse Database.

Hope this gives you an idea on how i do Data warehousing... i don't know if it's good or not, but till now the only problem i encountered is Data explosion .

As you can imagine, almost 70% of the effort to build a good DW/BI Solution is consumed by the ETL...

Abdel
Post #614021
Posted Sunday, December 7, 2008 2:22 AM


SSC Journeyman

SSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC Journeyman

Group: General Forum Members
Last Login: Friday, September 26, 2014 12:56 AM
Points: 81, Visits: 190
Thank you for sharing your view Abdel.
Kind regards,
Vincent
Post #615240
Posted Thursday, May 14, 2009 12:33 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Friday, October 11, 2013 8:52 AM
Points: 1, Visits: 44
Hi Vincent!!

Long time ago but it's a good article!!!

Thanks for cleany explanation!!

Cheers!
Post #717276
« Prev Topic | Next Topic »

Add to briefcase ««12

Permissions Expand / Collapse