Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase

SSIS package for DWH Load Expand / Collapse
Author
Message
Posted Tuesday, February 12, 2013 8:05 AM
Valued Member

Valued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued Member

Group: General Forum Members
Last Login: Tuesday, August 26, 2014 5:41 AM
Points: 63, Visits: 557
Hi All,

Please suggest the best option for designing SSIS package for DWH load.

Here is the scenario.

I have two star schema model with total 8 Dimension and 4 fact tables.

I thought of creating three packages.

First package will select data from systems(there are five different src systems) to staging with required transformations.

Second Package will load data to dimension tables.

Third Package will load data to fact tables.

Please suggest this is the best design to go with or is there any better design option you can suggest.


Thanks
Sam
Post #1418989
Posted Saturday, February 16, 2013 12:21 PM
Valued Member

Valued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued Member

Group: General Forum Members
Last Login: Tuesday, August 26, 2014 5:41 AM
Points: 63, Visits: 557
Any advice please...
Post #1420916
Posted Monday, February 18, 2013 5:27 AM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Monday, August 18, 2014 5:26 AM
Points: 119, Visits: 483
Hi Sam,

I need more details to make a suggestion:

- Do all the dimensions conform? That is, the dimensions are common to all of the fact tables.
- You said that there are 2 star schemas but 4 fact tables. Then it should be 4 star schemas, one per each fact table, or may be some fact tables are the same but with different aggregation levels?
- Which version of the SSIS are you using?

Kind Regards.


Paul Hernández
http://hernandezpaul.wordpress.com/
https://twitter.com/paul_eng
Post #1421150
Posted Monday, February 18, 2013 5:42 AM
Valued Member

Valued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued Member

Group: General Forum Members
Last Login: Tuesday, August 26, 2014 5:41 AM
Points: 63, Visits: 557
Hi Paul,

There is one Confirm Dimension common to the two star schema model.

In the first Star Schema model i have 1 fact and 5 dim tables,
In the second (its a complex one) 5 dim and 5 fact tables respectively.

As of now i have this plan for loading the data into the above tables:

1) First package to load all the data from source to staging with all necessary transformations
2) Second package from staging tables to Dim tables
3) Third package from Staging to Fact tables.

Or Should i segregate the package design based on the above two star schema model, if yes how to handle the confirm dimension then. Or Any idea/suggestion you please share

In case of failure at any step, will the above design is efficient to work with.

Let me know if you need any other information.


Thanks
Sam
Post #1421161
Posted Monday, February 18, 2013 6:52 AM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Monday, August 18, 2014 5:26 AM
Points: 119, Visits: 483
Hi Sam:

The complete architecture of a DW loading process is too complex to resume it in a forum post. What I can give you are only general tips:

- I think is Ok to have a package to load the Staging Area since you should only perform the "E" part of the ETL.
- I would separate the dimensions into "slow changing dimensions" and "dimensions". I don´t know if you manage the concepts but if not you could easily find several articles on this topic.
- I would load every fact table individually in a separated package.
- I would create a log table (or not if you’re working with SSIS 2012). In this table I would log the name of the package, starting and finish execution time, number of rows updated, inserted or deleted and a comment if you want.
- I would also create “CreateDT” and “UpdateDT” in every Fact and Dimension table. This is very useful if you have and error and you want to repair your execution.

I think your initial design allows error correction, you just need to use a strategy like the create and update columns above. There are several ways to handle errors, at job level, package level, component level, it depends on many factors like the relevance of the error.

Finally I would execute the sequence of packages using a scheduled Agent Job in the SQL Server.

Good luck and have fun implementing your solution


Paul Hernández
http://hernandezpaul.wordpress.com/
https://twitter.com/paul_eng
Post #1421189
Posted Monday, February 18, 2013 6:59 AM
Valued Member

Valued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued Member

Group: General Forum Members
Last Login: Tuesday, August 26, 2014 5:41 AM
Points: 63, Visits: 557
Thanks paul for your suggestion.I'll definitely think of these options.
Post #1421199
Posted Monday, March 4, 2013 2:03 PM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Wednesday, July 30, 2014 2:03 PM
Points: 124, Visits: 488
My advice is to create an individual package for each dimension and fact table. This makes it easier from a maintenance, troubleshooting, and auditing standpoint for a multitude of reasons. Some reasons for this is for opening packages (validation and ram use) and being able to run or schedule individual pieces of your DW.
Post #1426458
Posted Tuesday, May 14, 2013 2:49 PM
Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Friday, June 27, 2014 2:05 PM
Points: 300, Visits: 810
Concur with one pkg = one table, whether fact or dimension. the times I have bundled more than one process/table population into a package, I wound up regretting it.

We do mostly kill and fill here - most packages will do the extract, transform into staging, validate staging, partition switch to Live.

My incremental loaders are also one pkg per. Life is easier without having to figure out where that one table I forgot about 2 years ago gets loaded from ...

Post #1452850
« Prev Topic | Next Topic »

Add to briefcase

Permissions Expand / Collapse