SSIS package for DWH Load

  • Hi All,

    Please suggest the best option for designing SSIS package for DWH load.

    Here is the scenario.

    I have two star schema model with total 8 Dimension and 4 fact tables.

    I thought of creating three packages.

    First package will select data from systems(there are five different src systems) to staging with required transformations.

    Second Package will load data to dimension tables.

    Third Package will load data to fact tables.

    Please suggest this is the best design to go with or is there any better design option you can suggest.

    Thanks

    Sam

  • Any advice please... 🙂

  • Hi Sam,

    I need more details to make a suggestion:

    - Do all the dimensions conform? That is, the dimensions are common to all of the fact tables.

    - You said that there are 2 star schemas but 4 fact tables. Then it should be 4 star schemas, one per each fact table, or may be some fact tables are the same but with different aggregation levels?

    - Which version of the SSIS are you using?

    Kind Regards.

    Paul Hernández
  • Hi Paul,

    There is one Confirm Dimension common to the two star schema model.

    In the first Star Schema model i have 1 fact and 5 dim tables,

    In the second (its a complex one) 5 dim and 5 fact tables respectively.

    As of now i have this plan for loading the data into the above tables:

    1) First package to load all the data from source to staging with all necessary transformations

    2) Second package from staging tables to Dim tables

    3) Third package from Staging to Fact tables.

    Or Should i segregate the package design based on the above two star schema model, if yes how to handle the confirm dimension then. Or Any idea/suggestion you please share

    In case of failure at any step, will the above design is efficient to work with.

    Let me know if you need any other information.

    Thanks

    Sam

  • Hi Sam:

    The complete architecture of a DW loading process is too complex to resume it in a forum post. What I can give you are only general tips:

    - I think is Ok to have a package to load the Staging Area since you should only perform the "E" part of the ETL.

    - I would separate the dimensions into "slow changing dimensions" and "dimensions". I don´t know if you manage the concepts but if not you could easily find several articles on this topic.

    - I would load every fact table individually in a separated package.

    - I would create a log table (or not if you’re working with SSIS 2012). In this table I would log the name of the package, starting and finish execution time, number of rows updated, inserted or deleted and a comment if you want.

    - I would also create “CreateDT” and “UpdateDT” in every Fact and Dimension table. This is very useful if you have and error and you want to repair your execution.

    I think your initial design allows error correction, you just need to use a strategy like the create and update columns above. There are several ways to handle errors, at job level, package level, component level, it depends on many factors like the relevance of the error.

    Finally I would execute the sequence of packages using a scheduled Agent Job in the SQL Server.

    Good luck and have fun implementing your solution

    Paul Hernández
  • Thanks paul for your suggestion.I'll definitely think of these options.

  • My advice is to create an individual package for each dimension and fact table. This makes it easier from a maintenance, troubleshooting, and auditing standpoint for a multitude of reasons. Some reasons for this is for opening packages (validation and ram use) and being able to run or schedule individual pieces of your DW.

  • Concur with one pkg = one table, whether fact or dimension. the times I have bundled more than one process/table population into a package, I wound up regretting it.

    We do mostly kill and fill here - most packages will do the extract, transform into staging, validate staging, partition switch to Live.

    My incremental loaders are also one pkg per. Life is easier without having to figure out where that one table I forgot about 2 years ago gets loaded from ...

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply