Single DW multiple sources

  • Hey all,

    We currently have multiple systems that we report on. They all contain information on the same topic - but they are systems built externally by different companies.

    However, one major goal is to be able to report on these systems together.

    Here is my general plan within SSIS:

    1) Create denormalised structure for each system.

    2) Delete that system's rows from the universal tables, then copy the rows from the denormalised tables across.
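
    For illustration, the two steps above could look roughly like the sketch below. It uses Python's built-in sqlite3 module as a stand-in for SQL Server and SSIS, and every table and column name (universal_customer, stage_system_a, source_system) is hypothetical. The delete and the copy run in one transaction, so a failed load does not leave the universal table without that system's rows.

```python
import sqlite3

# In-memory database standing in for the warehouse (assumption: the real target is SQL Server).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE universal_customer (source_system TEXT, customer_id TEXT, name TEXT)"
)
conn.execute("CREATE TABLE stage_system_a (customer_id TEXT, name TEXT)")

# Existing rows in the universal table, plus a fresh extract for system A.
conn.executemany(
    "INSERT INTO universal_customer VALUES (?, ?, ?)",
    [("A", "1", "old A row"), ("B", "X9", "B row")],
)
conn.executemany(
    "INSERT INTO stage_system_a VALUES (?, ?)",
    [("1", "Alice"), ("2", "Bob")],
)


def reload_system(conn, source_system, staging_table):
    """Replace one system's rows in the universal table from its staging table."""
    # staging_table must come from trusted code: table names cannot be bound as parameters.
    with conn:  # commits on success, rolls back if either statement fails
        conn.execute(
            "DELETE FROM universal_customer WHERE source_system = ?",
            (source_system,),
        )
        conn.execute(
            "INSERT INTO universal_customer (source_system, customer_id, name) "
            f"SELECT ?, customer_id, name FROM {staging_table}",
            (source_system,),
        )


reload_system(conn, "A", "stage_system_a")
rows = conn.execute(
    "SELECT source_system, customer_id, name FROM universal_customer "
    "ORDER BY source_system, customer_id"
).fetchall()
print(rows)
```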

    Now there are a couple of things that have sprung to mind.

    Firstly, I am doing step one to minimise the load on the central tables. It should hopefully avoid much locking if multiple SSIS packages (one for each system) are running in parallel. Is this the best way to handle this?

    Secondly, keys: some are ints, others are varchar. What is the best way to keep the primary keys from different systems apart? I am planning to have a source column to identify the system. So am I better off having a composite key (source column plus original key) on every table, or, the other option I can think of, appending a letter that signifies the system to each and every PK and FK?

    I would appreciate any thoughts.

    Dan

  • danielfountain (5/17/2013)


    We currently have multiple systems that we report on. They all contain information on the same topic - but they are systems built externally by different companies.

    However one major goal is to be able to report on these systems together.

    If all systems are on the same topic, then it looks to me that the sensible thing to do is to build a single data mart / data warehouse.

    I would focus first on the design of that single data mart, e.g. the FACT and DIM tables that would store the data, and then plan the ETL processes. Most probably I would create staging tables that mimic as closely as possible what you have on the operational systems, then define how to map that data into the data warehouse.
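
    As a rough sketch of that staging-then-mapping idea (not a definitive design), the example below keeps one staging table per source in the source's own shape and maps both into a shared dimension. It uses Python's sqlite3 module in place of SQL Server, and all table and column names (stg_a_clients, stg_b_customers, dim_customer) are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Staging tables mirror each operational system as closely as possible.
    CREATE TABLE stg_a_clients (client_no INTEGER, full_name TEXT);
    CREATE TABLE stg_b_customers (cust_code TEXT, cust_name TEXT);

    -- Shared dimension with a warehouse-owned surrogate key.
    CREATE TABLE dim_customer (
        customer_key  INTEGER PRIMARY KEY,
        source_system TEXT NOT NULL,
        business_key  TEXT NOT NULL,
        customer_name TEXT,
        UNIQUE (source_system, business_key)
    );

    INSERT INTO stg_a_clients VALUES (101, 'Alice');
    INSERT INTO stg_b_customers VALUES ('C-7', 'Bob');
    """
)

# One mapping query per source: each returns (business_key, customer_name).
# The int key from system A is cast to text so both sources share one column type.
mappings = [
    ("A", "SELECT CAST(client_no AS TEXT), full_name FROM stg_a_clients"),
    ("B", "SELECT cust_code, cust_name FROM stg_b_customers"),
]

with conn:  # load both sources in a single transaction
    for source_system, select_sql in mappings:
        for business_key, name in conn.execute(select_sql).fetchall():
            conn.execute(
                "INSERT INTO dim_customer (source_system, business_key, customer_name) "
                "VALUES (?, ?, ?)",
                (source_system, business_key, name),
            )

dim_rows = conn.execute(
    "SELECT source_system, business_key, customer_name "
    "FROM dim_customer ORDER BY customer_key"
).fetchall()
print(dim_rows)
```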

    Hope this helps.

    _____________________________________
    Pablo (Paul) Berzukov

  • We have the concept of a SourceSystemKey to identify where a record came from. That, plus the business key of the table, is all we need.

    We have DimSourceSystem that we manually keep updated with system connection info and whether or not that system is enabled (every so often there will be a weather event that knocks out a line of business so we avoid that one until it is back online).
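
    A minimal sketch of that arrangement, assuming a layout I have made up for illustration: only SourceSystemKey and DimSourceSystem come from the description above, while DimCustomer, BusinessKey, SystemName and IsEnabled are hypothetical names. It again uses Python's sqlite3 module rather than SQL Server. The same business key can arrive from two systems without colliding, and a disabled system is skipped during the load.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE DimSourceSystem (
        SourceSystemKey INTEGER PRIMARY KEY,
        SystemName      TEXT NOT NULL,
        IsEnabled       INTEGER NOT NULL   -- 1 = load this system, 0 = skip it
    );
    CREATE TABLE DimCustomer (
        SourceSystemKey INTEGER NOT NULL REFERENCES DimSourceSystem (SourceSystemKey),
        BusinessKey     TEXT NOT NULL,
        PRIMARY KEY (SourceSystemKey, BusinessKey)
    );
    INSERT INTO DimSourceSystem VALUES
        (1, 'SystemA', 1), (2, 'SystemB', 1), (3, 'SystemC', 0);
    """
)

# Only systems currently flagged as enabled take part in the load.
enabled = [
    row[0]
    for row in conn.execute(
        "SELECT SourceSystemKey FROM DimSourceSystem "
        "WHERE IsEnabled = 1 ORDER BY SourceSystemKey"
    )
]

# The same business key '1001' arrives from three different systems.
incoming = [(1, "1001"), (2, "1001"), (3, "1001")]
with conn:
    for source_key, business_key in incoming:
        if source_key in enabled:
            conn.execute(
                "INSERT INTO DimCustomer VALUES (?, ?)", (source_key, business_key)
            )

loaded = conn.execute(
    "SELECT SourceSystemKey, BusinessKey FROM DimCustomer ORDER BY SourceSystemKey"
).fetchall()

# A repeat of the same key from the same system violates the composite key.
try:
    with conn:
        conn.execute("INSERT INTO DimCustomer VALUES (1, '1001')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(enabled, loaded, duplicate_rejected)
```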
