Multiple dimensions from one dimension table

  • I'm currently working to build a cube from our dimensional data warehouse.

    The way the data has been modelled in some cases is as a fact with a one-to-one relationship to the dimension. For instance you have a Transaction fact then a Transaction dimension with a corresponding record for each fact. This results in a dimension with as many rows as the fact.

    Also these dimension records can be very wide and contain multiple attributes and natural hierarchies, so in an SSAS OLAP cube I would like to break these into separate dimensions, e.g. Transaction Status & Transaction Type etc.

    This problem is similar to modelling from a single denormalised table.

    I'm not able to do anything much about the DW architecture.

    My question is: Is it OK to create multiple dimensions off the same dimension table, just using a subset of the columns for each one and how is performance affected in this scenario. There may be only 5 Transaction Statuses but they are joined to the fact by the Transaction Key so the dimension has a Transaction Key attribute which may have millions of values and then a Status attribute with only five. I'm only interested in the status and would hide the Key.

    Where will the above issues hit most, processing time or performance? Is there a better way to model this through the DSV or should I be using multiple hierarchies on the same Cube dimension?

    Many thanks.

  • My opinion: At this point, you're just trying to make a less than ideal back-end design work. I've been there a few times, and it really doesn't matter how you decide to jump through that hoop.

    Whether you create a derived table in the DSV, include all the attributes in one dimension or create multiple different dimensions from the same dimension table, the "select distinct..." query executed against your data source is still the same.

    Creating separate dimensions may be easier to navigate from an end-user point of view, and you may benefit a little from the reduced size in data store of that dimension. Aggregates may also help out a little in that scenario, from a performance point of view.

    Ultimately performance will be impacted by the underlying design, but I don't see too many other options. I'd definitely recommend that you test the behavior of the one v. multiple dimensions, but I don't expect it to be too different as the granularity to the fact will still be the same.

    I'd love to be proven wrong, but an inadequate back-end dimensional model will always come back to haunt you in some way...

Viewing 2 posts - 1 through 1 (of 1 total)

You must be logged in to reply to this topic. Login to reply