Level of Normalisation

  • Please could someone let me know upto how much we could go for normalisation, i have a Owner database

    then one owner can have many sites, one site can have many position, one position can have many position history, one history can have many collections.

    so my client in the end wants the performance based report of collections.

    but the performance is very very slow....As in a year i get around 400000 records from collections table for one owner.

    please let me know how much denormalisation is required in such case.

  • First things first.

    What sort of Analysis are you going to do. From your post I am assuming these are some sort of machines that you are looking at. I have worked in this area before and there are a few ways to analyse the data.

    I would expect the collection fact table at the centre to contain:

    Collection Identifier

    Collection Date

    Machine Identifier (to identify the model of the machine)

    Position Identifier

    Site Identifier

    Owner Identifier

    Collection Amount

    Then I would expect dimension tables for Owner, Machine, Site, Position, and calendars

    This would allow you to analyse collections

    by day, week, month etc.

    by site,

    by owner,

    by position,

    by machine etc

    Hope this helps.

    Mark.

  • Thanks Mark, yes you are exactly correct, I am looking the Machine data,

    From your answer , do you mean to say I should put all the Primary keys in Collections table?

    Does that mean Denormalising the Collection table to 0 level.

    The collection table also has other things like Operators relating to history table then there are terms table to calulate the shares etc.

    The table already have total 45 columns.

    Which area you from, I m looking after the Gaming Machines Industry

  • Yep all the primary keys from the dimension should be in the collections fact table. I would expect the collections fact table to consist of a collection id (primary key for the collections table), in addition there would be the additional foreign keys like machine id that links directly to the machine. The foreign keys will be important as these will be the items that the users wish to use in their analysis. I guess this does mean denormalising the collections data. It may also require a bit of work in the load process. The collections data you receive may be for position and site without specifying the machine!

    Having worked in this area before I understand how wonderfully overcomplicated it can be. Even the money you receive in the collection may not be all yours and may get split several ways.

    I work in the UK and supported a Leisure machines system for a couple of years and went onto support and develop the implementation of that system into a data warehouse. Business Intelligence can really make it so easy for people to see what is going on with the revenue and the costs associated with machines.

    All the Best.

    Mark.

  • Thanks Mark for your help 🙂 I should denormalise then.

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply