Degenerate dimension or regular dimension

  • I am designing a retail sales data warehouse. The FactRetailSales table is at the line item level of granularity.

    Placement of the InvoiceNumber as a degenerate dimension would usually be ideal...

    Order numbers, invoice numbers, and bill-of-lading numbers almost always appear as degenerate dimensions in a dimensional model." -Kimball

    However, Kimball also says:

    Although surrogate keys aren’t typically assigned to degenerate dimensions, each situation needs to be evaluated to determine if one is required. A surrogate key is necessary if the transaction control numbers are not unique across locations or get reused. For example, the retailer’s POS system may not assign unique transaction numbers across stores. ... Obviously, control number dimensions modeled in this way with corresponding dimension tables are no longer degenerate.

    In my case, the unique key for an invoice is a combination of Invoice Number and Store Number.

    So...

    Can I use the combination of my InvoiceNumber degenerate dimension and the either StoreNumber or even Store surrogate key as my Natural Key in the FactRetailSales?

    or

    Do I break the Invoice out to its own dimension (DimInvoice)? Then do I bring StoreNumber (or StoreKey) along with it? Wouldn't that make it a snowflake?

  • Personally I would go for the first option.

    Degenerate dimension of InvoiceNumber and Store SK.

    There is no use in putting it into an extra dimension. There are no descriptive attributes and the number of rows is equal to the fact table.

    I can also imagine there would be no actual analysis against the InvoiceNumber, right?

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Thanks for the quick reply.

    There are a few elements at the invoice number level that are repeating in the FactRetailSales table (InvoiceNumber, WorkOrderNumber, InvoiceDate and WorkOrderDate. I'm fine with flattening them all out to maintain the Star formation.

    My main concern is that there is going to be quite a bit of analysis at the invoice level.

    I am designing this DW for a tire and car service retailer. The executives will want to know when a customer came in for a tire, did they also get services (and all possible combinations). We will want to do a lot of customer profiling across the invoice.

    I'm using SQL Server 2012 BI tools, I'm only worried I'm going to make my life more difficult when I start working on the cube design.

  • If there are invoice level reports being built, would it be acceptable to create a single composite key of InvoiceNumber + StoreNumber?

    And if the requests for Invoice level reporting becomes significant, an aggregated fact table on InvoiceNumber?

    Thanks again,

    Gabriel

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply