Master Data - Relational or Dimensional modeling?

  • Dears,

    In my company, we have certain master data domains like customer; Vendor; Product; Materials; Bill of Materials.

    So far we have been doing our modeling using relational modeling in the cloud.

    It consists of entire tables that we push from SAP ECC and SAP MDG like MARA; MVKT, etc... then we have views on top of it.

    So, right now we have : Landing Zone (csv files added into this bucket); Raw Zone (Table like-to-like with SAP meaning, equal with the csv files we bring ) ; Harmonized zone (tables transformed in a way that they have only the required attributes)  and on top of it , still in harmonized zone, we have views.

    This is the current case for everything. Transactional and Master data

    Now, we have onboarded a new vendor (someone to build a new solution inside Cloud).

    They are building a new modern data platform using Cloud hence pushing transactional data as well as master data into it. Actually they will use the existing one and keep exactly same as we had the landing zone and the harmonized zone.

    But they will transform the harmonized zone.

    Harmonized zone will have master data and transactional data as dimensional modeling and on top of it the views.

    I can understand that transactional data can be modeled as dimensional modeling mixed with master data and create dimensions and facts to serve a certain product.

    But here, they defend that master data data which is separated from transactional data will be modeling in dimensional modeling.

    Would it be better to have the master data as relational instead of using Master data as dimensional modeling?

    Becauise multiple projects will take master data from this platform and I don't see the advantage of modeling using dimensional modeling instead of using the regular relational data model

     

    What is your opinion on this?

    Thank you very much,

    Pedro

  • It's real simple... have them prove that the performance is better with repeatable, demonstrative code.  One good test is worth a thousand expert opinions and until they can prove it code, it's only an opinion. Period.

    I'll also state that claims of supposed "Best Practices" aren't worth the powder to blow them to hell.  They're frequently manifested by a million people that are <insert drum roll here> are all doing it wrong because of the "bandwagon" effect.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I'm with Jeff.

    Let's just clarify a bit though. Dimensional modeling is a good thing, for certain kinds of queries. It's not a good thing for other kinds of queries. So, the question isn't Dimensional modeling, good or bad. The question is, what kind of queries are you running? Are you running more of a data collection and point lookup set of queries, Online Transaction Processing (OLTP)? Then traditional normalization is likely the way to go. Are you running large scale aggregations & analytics? Then dimensional is likely to support you better. Plus, if you're looking at analytical queries, you should also be exploring whether or not colunstore indexes are going to support you better than rowstore indexes, but that's a different discussion.

    In short, there isn't a good/bad storage mechanism. There are storage mechanisms better suited for your workload. Figure that out through analysis, and as Jeff says, good testing.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • There's nothing to stop you from having the normalised model and building a dimensional model on top of that. Then you have the best of both worlds when it comes to querying. Of course it's additional work, but worth it if you will need to serve both types of queries.

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply