DW design help

  • rocky_498

    SSCertifiable

    Points: 6493

    Hi Guys,

    I need advice. I am in the designing phase of building a DW.

    Question 1:- My source has (Customer/Customer Emergency Contact and Customer ResponsibleParty data in one source file). What do you guys think should I create one "Dim_Customer" to store all information or create three Dim tables ("Dim_Customer/Dim_EmergencyContact and Dim_ResponsibleParty) and link all tables with Customer id and Link Dim_Customer.CustomerKey to the Fact table?

    Question 2:- I have two source tables "Invoice and Invoice_Detail". Should I create two Dim tables "Dim_Invoice and Dim_InvoiceDetail" and link both tables with Invoice# and Link Dim_Invoice.InvoiceKey (SurgetKey) to the Fact table?

    I am planning to build a Star Schema DW.

    Please any advise.

    Thanks in advance.

    • This topic was modified 8 months ago by  rocky_498.
  • Jeff Moden

    SSC Guru

    Points: 996863

    For question #1, what's going to happen in the single table design when (and it will happen) you end up with more than one emergency contact and more than one responsible party?

    Some people would suggest a single table with one row per person whether it's a Customer, EmergencyContact, or ResponsibleParty where there's "Type" column to distinguish the 3 and an extra column to identify the related customer for the later 2.  I'd probably NOT go that way.  If EmergencyContact and ResponsibleParty rows are identical, perhaps a "Customer" and a "Contact" table would suffice.  Just remember that you can have multiple phone numbers, emails,  pages, whatever for contacts, as well, and all of those should probably be stored in a separate table instead of having a wad of null columns in a table.  I'd likely have a "Bridge" table between Customer and Contact called "CustomerContact" consisting only of the IDs of customers and contacts.  You'll end up with more of a snowflake than a star in that area.

    For question #2, yes... I'd have an "Invoice_Header" and "Invoice_Detail" table.  Also, seriously consider how you design the Invoice_Detail table.  After a month or so of and invoice coming into existence, all changes and backorders will likely be resolved and nothing else about the invoice or its details will ever change again.  So, planning for the future, instead of backing up what will become years of data that will never again change, plan on partitioning at least the Invoice_Detail table.  My personal favorite is Partitioned VIEWs rather than Partitioned TABLEs because Partitioned TABLEs have a whole lot of "gotchas" when it comes to restores, especially if you need a partial restore for a development or test environment.  They don't warn you of things like that in the documentation for Partitioned Tables... although they incorrectly "tip" that Partitioned Tables are better Partitioned Views... which I don't agree with.

    But, as with all else in SQL Server, "It Depends" and these are just suggestions from an old dude that's been screwed by it all both ways at one time or another.  Choose carefully. 😀

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".
    "If "pre-optimization" is the root of all evil, then what does the resulting no optimization lead to?"

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 2 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply