August 16, 2006 at 5:40 am
Howdy
Lately a question of how to develop SSIS systems has arisen in our office.
Now we all know that every developer has a certain style they prefer but I would like to know what ideas a large, experienced group thinks.
The need:
To import server tables from FoxPro and Excel.
To parse this data by calling stored procs.
To return invalid data via excel export.
To clean the mess made at import and prep for next time.
The main ideas are:
Create 1 DTSX with everything in it.
Create 1 DTSX with individual dataflows.
Create individual DTSX for each step.
What do the experts think?
Also, does anyone have a link to some best practices; preferably MS published SSIS best practices?
Thank you for your opinions.
MMDG
August 16, 2006 at 6:59 am
Hi George,
While you are waiting for an expert opinion, I thought I'd pop in my 5 cents worth of novice opinion.
You have N tables/spreadsheets to import, and each of them will go through (possibly) independant scrubbing routines before being imported into different tables?
I think the import of a specific table/spreadsheet could be done in a single SSIS package (1 DTSX), that would follow steps something like:
1. check for file existance - I used a .Net script
2. prepare staging tables (if you have them) - truncate in my case, as a SQL task
3. import data - data flow task
3.1 'open' data - excel source in my case
3.2 parse data - conditional split in my case
3.2.1 - route error information and data to a error table
3.3 prepare data - data transformation in my case
3.3.1 - route error information and data to a error table
3.4 'store' data - OLE DB Destination in my case
4. move file to archive location
Now I follow the exact same process for each of my spreadsheets (I have 3) - but they are independant, and have no common tables etc.
Finally, I have an 'orchestration' SSIS package that invokes the 3 different packages I have. So I have 4 packages in total - 1 for each file source to be handled, and the orchestrator. Why? Well, this give my 1 method to invoke to run the entire import, as a SQL agent job invoking the Orchestration Package, on a scheduled basis. I also have 3 SQL agent jobs, unscheduled, - 1 for each of the 'child packages'. If I need to re-do only a part of the batch - let's say a user was locking the file, or an emergency change to 1 file needs to be processed - I simply invoke the relevant package for that specific file. No need to climb into a DTSX package and try and execute parts of it to handle the specific file.
Yes, it takes more time to create, but it follows a 'modular' philosophy, and gives mes easy re-use, and manageablity in terms of execution.
I look forward to hearing from more experienced people on this...
CiaO 4 NoW
Viewing 2 posts - 1 through 2 (of 2 total)
You must be logged in to reply to this topic. Login to reply