Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase 12»»

Using ODS As A Middle Layer Expand / Collapse
Author
Message
Posted Saturday, March 25, 2006 10:33 AM


SSC Journeyman

SSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC Journeyman

Group: General Forum Members
Last Login: Monday, January 14, 2013 4:24 AM
Points: 81, Visits: 188
Comments posted to this topic are about the content posted at http://www.sqlservercentral.com/columnists/vRainardi/usingodsasamiddlelayer.asp
Post #268457
Posted Tuesday, April 11, 2006 8:18 AM
SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Wednesday, December 11, 2013 3:04 AM
Points: 210, Visits: 115

I'm not sure I entirely understand your argument for an ODS in this context. The only difference between that and the warehouse is that the ODS may not encompass the whole enterprise. Assuming that you would still want to have an enterprise-wide data warehouse, the ODS in this context seems to be redundant. I guess what I'm asking is how is it not a data warehouse

I'm also confused about what operational functions (updates etc) you would want to apply in the ODS. Surely any updates should be applied back at the source system.

Lastly if the data marts are being fed from the ODS (as illustrated in the diagram) then what is the data warehouse used for. Shouldn't the data warehouse be used as the one holy sacred supplier of all data for reporting?

Sorry I don't mean to be so negative, data warehouse and data mart design is always very open to interpretation. Thanks for the article it has given me some new angles to consider.




Post #272474
Posted Saturday, April 22, 2006 3:32 PM


SSC Journeyman

SSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC Journeyman

Group: General Forum Members
Last Login: Monday, January 14, 2013 4:24 AM
Points: 81, Visits: 188

Hi Robert, thank you for your comments.

The ODS is not a data warehouse because ODS is in the third normal form, not in dimensional model. But yes you are right, ODS contains all the data in the enterprise, and data warehouse database also contains all the data in the enterprise. Because of the amount of data the ODS contains is the same as data warehouse, yes in this sense ODS is redundant. But the benefits I mentioned in the articles outweigh the cost of this redundancy.

About what operational functions (updates) that you would like to apply in the ODS, and why this updates are not applied back in the source system, please allow me to illustrate this case with an example. I hope it will clarify the situation. Referring to the diagram in the article, let's say for example you have 3 ERP systems: JDE, SAP and a propriatary system (custom developed in your company) called CGL. Say you have a business performance metric called strike rate, which is computed in the ODS, based on the information from all 3 ERP systems. This strike rate is by service centre, by service type and daily. So for example, for Birmingham, for Fiscal Year 2006 week 24, the strike rate for ServiceType1 are 93%, 92%, 96%, 95%, 91%, 94% (Monday to Saturday). And so on, we calculate the strike rate for every service centre, for every service type, for every day in the year. To calculate these percentages you need information not only from 1 ERP system, but from all 3.

Now, for this strike rate metric, your company also have strike rate target, which is by service centre, by service type. For example, for Birmingham, for ServiceType1, the strike rate is 95% whilst for Manchester, for ServiceType2, the strike rate is 92%. This information are not anywhere in the source system and we need to get it to the ODS. We could update the ODS directly, for example by building a web/windows .NET application. Or (this one is probably more preferable) by designing the new application to save the strike rate targets on a separate database, and we then ETL / import this information into staging, then into ODS.

About your last point, if the data marts are being fed from the ODS then what is the data warehouse used for. Yes we could feed the data marts from the data warehouse database (as the data warehouse database contains all the information on the ODS), but it is more efficient to feed them straight from ODS, especially if the data marts are fed daily.

Thank you for all your comments Roberts, they have been an eye opener to me too.

Kind regards,
Vincent

Post #274698
Posted Monday, April 24, 2006 2:10 AM
SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Wednesday, December 11, 2013 3:04 AM
Points: 210, Visits: 115

Thanks for you reply Vincent.

Rob.




Post #274759
Posted Tuesday, May 23, 2006 8:04 AM
SSC Eights!

SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!

Group: General Forum Members
Last Login: 2 days ago @ 8:20 AM
Points: 916, Visits: 993

Really good article but I must disagree with your comment that the ODS is not a data warehouse because it is in 3rd normal form, not in dimensional model.  Although storing data in a dimensional model is valid and may be the best model for supporting end-user reporting, it is not the end-all model for a data warehouse.  The only reason your ODS is not the definative data warehouse is because you said it can be updated.  This idea is not generally supported in the data warehouse... but as with anything there are exceptions to every rule i.e. Type-1 slowly changing dimensions.

 

I believe that different methods work for different environments and we must be flexible to real world environments yet frigid in staying with proven best practices... whether it's Kimball dimensional models, Inmons CIF and DW 2.0, or John Doe in Little Company USA's who made the data warehouse work within his real world limitations.  Business practices change, methodologies change... and who knows, in a couple of years we may have a new method for building data warehouse system... the Rainardi Method  

 

Anyway I digress, overall great article with a lot to think about and I'm looking forward to more.

 

 

 

 




Post #282082
Posted Monday, January 8, 2007 2:21 PM


SSC Journeyman

SSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC Journeyman

Group: General Forum Members
Last Login: Monday, January 14, 2013 4:24 AM
Points: 81, Visits: 188

Thanks David. Sometimes I wonder about that, why there are only 3 types of SCD, why don't we make a new type. Why only Kimball and Inmon method (and Barry Devlin), why don't we make a new method. Your question whether a data warehouse must be a dimensional has been in my mind for a long time now. I think dimensional is a better format to get the data out but normalised form is a better format for integration. I think real time ETL is really hot at the moment and it breaks the fundamental principles that a DW is a non-volatile store. Another trend is text analytics which developed very quickly in the last year or so which I think will be a hot potato in this 2007. What do you think?

Regards,
Vincent

Post #335191
Posted Friday, April 13, 2007 7:51 AM
Say Hey Kid

Say Hey KidSay Hey KidSay Hey KidSay Hey KidSay Hey KidSay Hey KidSay Hey KidSay Hey Kid

Group: General Forum Members
Last Login: Tuesday, October 22, 2013 11:18 AM
Points: 676, Visits: 432

Vincent,

With regards to the 3 types of SCD, I don't see another alternative to handling updates in a table. Either you 1) Replace the data, 2) Version the data in a new row, or 3) Version the columns that change in the same row. If there is another technique you were thinking of, I would love to hear it. I'm always open to hearing new ideas that may help out in the future.

For the warehouse here, we find that using the Kimball design for the warehouse as well as the 2 ODS's we have works just fine. Our reasoning for using an ODS vs putting the data in the warehouse was time based. We had limited time to get data into a storable format and add to that 90% of the data is junk in one particular column that is used for reference information. Using the ODS buys us time to analyze the junk so we can eventually create some rules in ETL to cleanse the data and land it in the DWH.

 

Cliff




Post #358199
Posted Tuesday, January 1, 2008 2:14 AM


SSC Journeyman

SSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC Journeyman

Group: General Forum Members
Last Login: Monday, January 14, 2013 4:24 AM
Points: 81, Visits: 188
Thank you for sharing your experience about ODS with us Cliff.
With regards to the SCD beyond the 3 types Margy Ross and Ralph Kimball describe several alternatives and combination (hybrid) types in this article. The one I was thinking of last year is storing the historical dimensional data in another table(s), using the technique similar to the one used for 'auditing', i.e. the structure of the history table is the same as the dimension table, with some additional columns such as change timestamps, flag columns (such as change type and latest), version columns and user information columns. The alternative of this technique is storing only the necessary columns of the dimension table, rather than all columns. In the case of a 'snow-flaked' or 'outtrigger' dimension, we have the options of combining the historical dimensional data in 1 history table, or mimic-ing the structure of the 'snow-flaked' dimension tables.
I apologise I've just seen your post today.
Kind regards,
Vincent
Post #437707
Posted Thursday, October 23, 2008 3:41 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Monday, May 3, 2010 9:35 PM
Points: 3, Visits: 14
Hi Vincent,

Thanks for the enriching discussion about
whether To impart the ODS as middle layer.

I am pretty excited to know how we can
Implement an ODS in an environment where
both the source and target databases are
from Relational SQLSERVER 2005.

If you would provide me DO's and DONT'S
of implementation methodology with a
Suitable example especially dealing with the
"INCREMENTAL LOAD" will be greatly helpful to my task.

Thanks,
Vijay
Post #590844
Posted Friday, October 24, 2008 12:16 AM


SSC Journeyman

SSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC Journeyman

Group: General Forum Members
Last Login: Monday, January 14, 2013 4:24 AM
Points: 81, Visits: 188
Hi Vijay,
there has been a lot of things that I have learnt since I wrote that ODS article and last year I wrote a data warehousing book, http://www.amazon.co.uk/dp/1590599314/
My current view is that we don't have to have an ODS to build a data warehouse. So regarding the do and don'ts, the top of the list would probably be "only build the ODS if you need one". The dimensional data warehouse should contain the data at the lowest level/grain. But in practice this is not always the case. 95% of the case it is applicable, but there are certain cases where it's not practical and we had to source operational lower level reporting from ODS.
The second DO is probably the grain, i.e. must be the same as the source business system, i.e. should be the lowest possible level, don't aggregate.
3rd DO: should be in 3rd (or more) normal form. It shouldn't be in 2nd or 1st. And it definitely shouldn't be in dimensional model.
4th DO: have a internal ODS key/ID, rather than relying on the source system key.
5th DO: do have a DQ (Data Quality), i.e. create data firewall between the source system and ODS
DON'T: don't store history, only keep the current version. If we need to do snapshoting and SCD, it is better to do this on the dimensional Data Warehouse.
DON'T: don't copy the structure of the source system as is, but organize into proper 3rd NF.

For incremental load is we can use identity column or timestamp column. I wrote the incremental loading here: http://www.sqlservercentral.com/articles/Design/dataextractionmethodspart1/2356/

Please don't hesitate to come back with any question, I'll be glad to assist.

Regards,
Vincent

PS. You don't have to follow the above do and don't as it is, your situation could be different. So they are just for your consideration. Good luck with your project.
Post #590988
« Prev Topic | Next Topic »

Add to briefcase 12»»

Permissions Expand / Collapse