Staging DB -> Star Schema DB -> SSAS Cube?

  • I have to modify the design of a data mart I "inherited." Currently the big picture is this:

    ETL Staging DB (EDI, Excel, and miscellaneous text files are loaded here) -> Second Staging DB (incorporates ETL Staging DB and some MS Dynamics AX data) -> SSAS Cube

    I realize SSAS 2008 R2 (the version we use) does not require a relational star-schema source. However, wouldn't it be a "best practice" to replace that second staging DB (which does not use a star schema) with an interim DB that does use a star schema?

  • While I know it's theoretically possible to build your SSAS db from a normalized relational db, I've never done it (nor seen it). I've always used a star schema (with perhaps a little snow-flaking). I would say yes, this is the accepted best practice to base your SSAS db & cubes upon a star schema db.

    Just my opinion, YMMV.

    Rob

  • imani_technology (8/21/2012)


    I have to modify the design of a data mart I "inherited." Currently the big picture is this:

    ETL Staging DB (EDI, Excel, and miscellaneous text files are loaded here) -> Second Staging DB (incorporates ETL Staging DB and some MS Dynamics AX data) -> SSAS Cube

    I realize SSAS 2008 R2 (the version we use) does not require a relational star-schema source. However, wouldn't it be a "best practice" to replace that second staging DB (which does not use a star schema) with an interim DB that does use a star schema?

    In my mind there are several layers on a DSS, they are:

    1- The OLTP source system(s) <= The systems that support the operation of the business.

    2- The Staging area <= Where you put whatever you Extract from OLTP and eventually Transform it.

    3- The Datamart <= A dimensional (star-schema) structure where you Load the data sourced from OLTP

    4- The Delivery layer <= Structure from where you deliver data to the business.

    Cubes are a good example of a Delivery Layer, but in some cases you will find user queries hitting the Datamart tables directly - in which case you have either a mix or just datamart-based reporting.
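    To make the "queries hitting the Datamart tables directly" case concrete, here is a minimal sketch of a star schema and a report query against it. It uses Python's built-in sqlite3 as a stand-in for SQL Server, and every table, column, and customer name is hypothetical, not taken from the system discussed in this thread.

```python
import sqlite3


def build_demo_mart():
    """Create a tiny in-memory star schema: one fact table and two dimensions.

    All names and rows here are made up for illustration.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE dim_customer (
            customer_key  INTEGER PRIMARY KEY,  -- surrogate key
            customer_name TEXT
        );
        CREATE TABLE dim_date (
            date_key      INTEGER PRIMARY KEY,  -- e.g. 20120821
            calendar_year INTEGER
        );
        CREATE TABLE fact_sales (
            customer_key INTEGER REFERENCES dim_customer(customer_key),
            date_key     INTEGER REFERENCES dim_date(date_key),
            sales_amount REAL
        );
        INSERT INTO dim_customer VALUES (1, 'Contoso');
        INSERT INTO dim_customer VALUES (2, 'Fabrikam');
        INSERT INTO dim_date VALUES (20120821, 2012);
        INSERT INTO dim_date VALUES (20120822, 2012);
        INSERT INTO fact_sales VALUES (1, 20120821, 100.0);
        INSERT INTO fact_sales VALUES (1, 20120822, 50.0);
        INSERT INTO fact_sales VALUES (2, 20120822, 75.0);
        """
    )
    return conn


def sales_by_customer(conn):
    """Report query: join the fact table to one dimension and aggregate."""
    return conn.execute(
        """
        SELECT c.customer_name, SUM(f.sales_amount)
        FROM fact_sales AS f
        JOIN dim_customer AS c ON c.customer_key = f.customer_key
        GROUP BY c.customer_name
        ORDER BY c.customer_name
        """
    ).fetchall()


if __name__ == "__main__":
    for name, total in sales_by_customer(build_demo_mart()):
        print(name, total)
```

    The same fact and dimension tables could feed a cube or be queried directly by a reporting tool, which is the "mix" mentioned above.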

    In this particular case, adding a star-schema datamart in the middle of your already working system sounds like a project in itself - be sure you don't shoot yourself in the foot in the process of making the system structure "cuter".

    Any particular reason to do it? Bad performance, or something else? How are you planning to sell the project to the business?

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment, the advisor should not be held liable or responsible for any actions taken based on the given advice.
  • The current system is functional, but that's about it. Apparently very little thought was put into its design. As a result, the system is hard to maintain and even harder to extend. The users begrudgingly use the system, but don't embrace it.

    I would like the new system to be built with actual user input, and I would like it to follow business processes based on that input. I also want the new system to be maintainable, and I would like it to support new functionality, such as new dimensions, in the future.

  • PaulB-TheOneAndOnly (8/22/2012)

    1- The OLTP source system(s) <= The systems that support the operation of the business.

    2- The Staging area <= Where you put whatever you Extract from OLTP and eventually Transform it.

    3- The Datamart <= A dimensional (star-schema) structure where you Load the data sourced from OLTP

    4- The Delivery layer <= Structure from where you deliver data to the business.

    I think what you're trying to do is skip the 3rd layer mentioned by PaulB. The default SSAS cubes that come with Dynamics AX use complex queries against the OLTP system, and I assume you are simply trying to side-load some extra data through some staging tables.

    My personal experience with building cubes against Dynamics AX was to build a custom star-schema data mart and then build the SSAS solution on top of it. Pointing a cube directly at a structure that is not a star or snowflake schema means relying on complex queries to build out your facts and dimensions. Short term, it's quick and easy. Long term, you'll have performance problems during processing. Besides performance, the OLTP tables are not built to support change tracking, so if your dimensions need to be SCD Type 2, you'll need to build a complex support system just to handle change tracking.
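    For anyone unfamiliar with SCD Type 2, here is a rough sketch of the change tracking a data mart dimension would need to carry. It is plain Python over an in-memory list, not SSIS or T-SQL, and the field names (business_key, valid_from, valid_to, is_current) are hypothetical conventions rather than anything from Dynamics AX.

```python
from datetime import date


def apply_scd2(dimension, business_key, attributes, as_of):
    """Apply a Type 2 change to an in-memory dimension (a list of dict rows).

    If the tracked attributes changed, the current row is expired and a new
    version is appended. Returns the surrogate key of the current version.
    """
    current = next(
        (row for row in dimension
         if row["business_key"] == business_key and row["is_current"]),
        None,
    )
    # No change: keep pointing at the existing version.
    if current is not None and current["attributes"] == attributes:
        return current["surrogate_key"]
    # Changed: close out the old version instead of overwriting it.
    if current is not None:
        current["valid_to"] = as_of
        current["is_current"] = False
    new_row = {
        "surrogate_key": len(dimension) + 1,  # simple sequence for the sketch
        "business_key": business_key,
        "attributes": dict(attributes),
        "valid_from": as_of,
        "valid_to": None,
        "is_current": True,
    }
    dimension.append(new_row)
    return new_row["surrogate_key"]


if __name__ == "__main__":
    dim_customer = []
    apply_scd2(dim_customer, "C001", {"city": "Austin"}, date(2012, 1, 1))
    apply_scd2(dim_customer, "C001", {"city": "Dallas"}, date(2012, 8, 22))
    for row in dim_customer:
        print(row)
```

    Fact rows loaded before the change keep the old surrogate key, so history is preserved. That is the part an OLTP table does not give you for free.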

  • Sorry, I don't think I explained myself properly.

    I WANT to have a step #3 like the one below. The original system (the one I did not build) skipped step #3. Basically, it has: a) a staging db to dump EDI, spreadsheet, and text files from external systems, b) another staging db that imports all the tables from a) plus some data from our Dynamics AX system of record, c) an SSAS database, d) a cube built from the SSAS database. Currently, the data isn't in a measure/dimension format until step c).

    I want a data mart step in a SQL Server relational database that takes data from staging and from AX and puts them into a star schema. Then the SSAS database can pull from that relational data mart. Does that make more sense?
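    A minimal sketch of that data mart step, assuming the AX data supplies the customer dimension and the staging rows supply the facts. The real load would presumably be SSIS or T-SQL; this is plain Python, and the row shapes, the field names (account, name, amount), and the reject handling are all hypothetical.

```python
def load_star_schema(staging_rows, ax_customers):
    """Build a customer dimension and a sales fact list from two sources.

    ax_customers: rows from the system of record, used for the dimension.
    staging_rows: rows from the file-based staging db, used for the facts.
    Returns (dimension_rows, fact_rows, rejected_rows).
    """
    dim_customer = {}  # business key (account) -> dimension row
    for ax in ax_customers:
        if ax["account"] in dim_customer:
            continue  # ignore duplicate accounts in this sketch
        dim_customer[ax["account"]] = {
            "customer_key": len(dim_customer) + 1,  # surrogate key
            "account": ax["account"],
            "name": ax["name"],
        }

    fact_sales = []
    rejected = []
    for row in staging_rows:
        dim_row = dim_customer.get(row["account"])
        if dim_row is None:
            # No matching dimension member: set the row aside for review.
            rejected.append(row)
            continue
        fact_sales.append({
            "customer_key": dim_row["customer_key"],
            "amount": float(row["amount"]),  # staged files arrive as text
        })
    return list(dim_customer.values()), fact_sales, rejected


if __name__ == "__main__":
    ax = [{"account": "C001", "name": "Contoso"}]
    staged = [{"account": "C001", "amount": "100.00"}]
    print(load_star_schema(staged, ax))
```

    The point of the sketch is only that facts reference dimensions by surrogate key before SSAS ever sees the data, so the cube can read simple tables instead of complex queries.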

    richykong (8/22/2012)


    PaulB-TheOneAndOnly (8/22/2012)

    1- The OLTP source system(s) <= The systems that support the operation of the business.

    2- The Staging area <= Where you put whatever you Extract from OLTP and eventually Transform it.

    3- The Datamart <= A dimensional (star-schema) structure where you Load the data sourced from OLTP

    4- The Delivery layer <= Structure from where you deliver data to the business.

    I think what you're trying to do is skip the 3rd layer mentioned by PaulB. The default SSAS cubes that come with Dynamics AX use complex queries against the OLTP system, and I assume you are simply trying to side-load some extra data through some staging tables.

    My personal experience with building cubes against Dynamics AX was to build a custom star-schema data mart and then build the SSAS solution on top of it. Pointing a cube directly at a structure that is not a star or snowflake schema means relying on complex queries to build out your facts and dimensions. Short term, it's quick and easy. Long term, you'll have performance problems during processing. Besides performance, the OLTP tables are not built to support change tracking, so if your dimensions need to be SCD Type 2, you'll need to build a complex support system just to handle change tracking.
