Import from multiple AS400 tables and place into one SQL Table

  • I am trying to convert a GROUP of SQL 2000 DTS packages over to SQL 2008 SSIS. The setup consists of one master DTS package that calls 47 other DTS packages, which port data from our AS/400 to our SQL Server. Each of the 47 DTS packages does the following:

    1.) Creates a temporary table.

    2.) Imports the data from the table on the AS/400 into the temp table.

    3.) Moves the data from the temp table into the final "MASTER" table.

    4.) Drops the temp table.

    The tables are named StoLim01 through StoLim47.
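    In rough T-SQL terms, each package boils down to something like this (the linked-server name, library, and column names below are just placeholders for illustration):

    -- 1) create the temp table (placeholder column list)
    CREATE TABLE #StoLim01_Temp (Col1 VARCHAR(20), Col2 VARCHAR(20));

    -- 2) pull the AS/400 data into the temp table ('AS400' linked server is hypothetical)
    INSERT INTO #StoLim01_Temp (Col1, Col2)
    SELECT * FROM OPENQUERY(AS400, 'SELECT COL1, COL2 FROM SOMELIB.STOLIM01');

    -- 3) move it into the master table
    INSERT INTO dbo.MasterTable (Col1, Col2)
    SELECT Col1, Col2 FROM #StoLim01_Temp;

    -- 4) drop the temp table
    DROP TABLE #StoLim01_Temp;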

    I would prefer to use the looping constructs in SSIS to do this instead of having 48 .dtsx packages in one project. I have tried numerous variations to get the loops to work, but I have been unable to accomplish it.

    Any help with this would be GREATLY appreciated as I am just now learning SSIS and am a bit confused. :crazy:

    Thank You,

    Chris

  • Hi Chris,

    It is definitely possible to do it with a loop in SSIS.

    Another question though...

    Do all the source tables (and temp tables for that matter) have the same layouts?

    If they don't, then you won't be able to re-use the same data flow anyway...so it might be better to just have one "Execute SQL Task" that contains all the create statements for the temp tables. You can follow it up with a separate "Data Flow Task" for each table you want to extract.

    The reason I say that (without knowing everything about your environment) is that a loop could over-complicate everything unless the layouts of the temp tables are exactly the same. With that said...if the temp tables all have the same layout, you might be even better off just having one temp table with an extra column to indicate the source...

  • Martin,

    Yes, the tables all have the same layout. I was considering doing a massive UNION, but even that comes with problems, as the resulting table has a field that contains the number of the file each record came from.

    For example:

    Table01 (GLNMFM VarChar(20), GLNMTO VarChar(20))

    Table02 (GLNMFM VarChar(20), GLNMTO VarChar(20))

    Table03 (GLNMFM VarChar(20), GLNMTO VarChar(20))

    FinalTable (CLGROUP CHAR(2), GLNMFM VarChar(20), GLNMTO VarChar(20))

    Get all the records from TableXX, put them into FinalTable, and add a derived column populated with the XX table number.
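    In other words, for each table I basically need the equivalent of this (using Table01 as the example; the '01' would be derived from the table name):

    INSERT INTO FinalTable (CLGROUP, GLNMFM, GLNMTO)
    SELECT '01', GLNMFM, GLNMTO
    FROM Table01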

    The other issue is that I would like to make this as dynamic as possible, since the AS/400 team is always adding new tables on their system. I would really like something driven by metadata in a SQL table.

    Thank You,

    Chris

  • AmcAmx (8/6/2010)


    Martin,

    Yes, the tables all have the same layout. I was considering doing a massive UNION, but even that comes with problems, as the resulting table has a field that contains the number of the file each record came from.

    For example:

    Table01 (GLNMFM VarChar(20), GLNMTO VarChar(20))

    Table02 (GLNMFM VarChar(20), GLNMTO VarChar(20))

    Table03 (GLNMFM VarChar(20), GLNMTO VarChar(20))

    FinalTable (CLGROUP CHAR(2), GLNMFM VarChar(20), GLNMTO VarChar(20))

    Get all the records from TableXX, put them into FinalTable, and add a derived column populated with the XX table number.

    The other issue is that I would like to make this as dynamic as possible, since the AS/400 team is always adding new tables on their system. I would really like something driven by metadata in a SQL table.

    Thank You,

    Chris

    OK...that makes sense. You could still achieve what you need with the UNION, for example:

    select '01' as CLGROUP, GLNMFM, GLNMTO from Table01

    union

    select '02' as CLGROUP, GLNMFM, GLNMTO from Table02

    ...

    But it is definitely better to drive the loading from metadata if more tables are added frequently.

    I would suggest the following approach (a rough sketch of the tables follows the list):

    -- Create a table for the metadata

    -- Create one permanent staging table

    -- Create an SSIS package that does the following:

       -- Truncate the staging table

       -- Select all data from the metadata table, and store the results in a variable

       -- Use a For Each loop (with ADO enumerator) to loop through the records

       -- Use expressions to dynamically set the SQL query for the source connection

       -- Populate the staging table, and have one process populating the final table
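    To make that a bit more concrete, the two tables could look roughly like this (all names here are just suggestions, so adjust them to your own standards):

    -- Metadata: one row per AS/400 table to load
    CREATE TABLE dbo.ETL_SourceTables
    (
        SourceTable VARCHAR(128) NOT NULL,   -- e.g. 'StoLim01'
        CLGroup     CHAR(2)      NOT NULL    -- e.g. '01'
    );

    -- One permanent staging table, with a column recording the source
    CREATE TABLE dbo.Staging_StoLim
    (
        SourceTable VARCHAR(128) NOT NULL,
        GLNMFM      VARCHAR(20)  NULL,
        GLNMTO      VARCHAR(20)  NULL
    );

    -- Query for the Execute SQL Task that feeds the For Each loop
    SELECT SourceTable, CLGroup FROM dbo.ETL_SourceTables;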

    I hope the above makes sense. Unfortunately I don't have the luxury of time to give you screenshots etc at the moment, but give the above a try and ask if you get stuck along the way.

    Martin.

  • Martin,

    I was able to complete the following items:

    -- Create one permanent staging table

    -- Create an SSIS package that does the following:

    -- Truncate the staging table

    -- Select all data from the metadata table, and store the results in a variable

    -- Use a For Each loop (with ADO enumerator) to loop through the records

    But I am missing the boat on what you mean by the following ones:

    -- Use expressions to dynamically set the SQL query for the source connection

    -- Populate the staging table, and have one process populating the final table

    Additional insight would be appreciated.

    Thank You,

    Chris

  • AmcAmx (8/9/2010)


    Martin,

    I was able to complete the following items:

    -- Create one permanent staging table

    -- Create an SSIS package that does the following:

    -- Truncate the staging table

    -- Select all data from the metadata table, and store the results in a variable

    -- Use a For Each loop (with ADO enumerator) to loop through the records

    But I am missing the boat on what you mean by the following ones:

    -- Use expressions to dynamically set the SQL query for the source connection

    -- Populate the staging table, and have one process populating the final table

    Additional insight would be appreciated.

    Thank You,

    Chris

    Hi Chris,

    Sorry for the delay in replying.

    If I understand your situation correctly, you now need to extract the data from the various source tables.

    If you were able to successfully implement the For-loop, you should have the source table's name in a variable for each enumeration.

    You will now have to create a data flow task within the For-loop, and the challenge is that you have to change the source table of the query in your Data Flow source connection each time (based on the table name in your variable).

    The only way to achieve this is to change that query dynamically by using an expression. See Books Online for a reference.

    The Data Flow will populate your staging table, but based on what you've described you will need to know later which source table each row came from. The best way (in my opinion) to do this is to select your variable (which holds the source table name) as one of the columns in the SELECT statement. Your staging table should have a column to hold this value, and that will make it easier to extract that information for the final insert.
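    For example, when the loop variable holds StoLim01, the query built by your expression could evaluate to something roughly like this (any library/schema prefix on the AS/400 side is omitted here):

    SELECT GLNMFM, GLNMTO, 'StoLim01' AS SourceTable
    FROM StoLim01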

    The final insert can be a query or stored procedure, outside of the For-loop (see the sketch below). Please let me know if this doesn't make sense...
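    If you went with a staging table like the one I sketched earlier, the final insert could be roughly along these lines (deriving the two-character group from the end of the source table name):

    INSERT INTO FinalTable (CLGROUP, GLNMFM, GLNMTO)
    SELECT RIGHT(SourceTable, 2),   -- 'StoLim01' -> '01'
           GLNMFM,
           GLNMTO
    FROM dbo.Staging_StoLim;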

    Martin.

  • Martin,

    Thank you again for the help. What you are saying makes total sense. The problem I am having is the implementation. Currently (as a test) I have populated the Data Flow Task inside the Foreach Loop with an OLE DB connection to a dummy file structure, with a precedence constraint to a Flat File Destination that I dummy up by changing the connection string property. Actually, I think the lights just came on... let me try something and I will get back with my results...

    ThanX again !

    Chris

  • Martin,

    OK... I guess my idea will not work, as it seems that you cannot change the OLE DB OpenDataSet property during execution (or at least I am not able to find a way to do it). Because of this I am not able to connect to each of the individual tables on the AS/400 to retrieve their data. If you have any ideas on how it can be changed, I am always open to them!

    Thank You,

    Chris

  • There might be a way to do this. Look at the properties of the Data Flow task itself...it will allow you to change the properties of the connections within the Data Flow at runtime (and you could use an expression).

    I was able to successfully implement a similar solution to dynamically change Oracle queries at runtime...

    If this does not solve your problem, could you post a screenshot(s) to give me a better idea of what you have done already?

  • See the attachment...

  • Martin,

    Very interesting thing I have here... it seems my OLE DB Source connection does not have those properties, although my SQL Server Destination does... hmmm....

    Here are some screenshots!

  • Chris,

    What about the Data Flow Task itself? The screenshot I posted showed the properties of the Data Flow Task, not the Source connection...

    Another question: Does your OLEDB Source Connection allow you to specify a query? If so, then there must be a property (that relates to the query) that you can change at runtime with an expression.

    What are the properties of your OLEDB Source Connection?

    What about the Data Flow Task itself? The screenshot I posted showed the properties of the Data Flow Task, not the Source connection...

    The Data Flow Task does not have those properties... it shows the properties of the Destination connection... (which does not make sense to me).

    Another question: Does your OLEDB Source Connection allow you to specify a query? If so, then there must be a property (that relates to the query) that you can change at runtime with an expression.

    Yes, I am now trying to choose "SQL command from variable" as the data access mode, which then allows me to choose a variable name. I have created another variable, scoped to the Foreach Loop, that builds a SQL query based upon the current filename (from the filename variable).

    For example: "Select GLNMFM,GLNMTO FROM " + @[user:CurrentFileBeingProcessed]

    .... BUT....(yea again another BUT).... I get an error when I try to assign the variable in the pick list. For some reason it does not like the CurrentFileBeingProcessed variable.

    FWIW... I got this idea from here:

    http://blog.stevienova.com/2009/04/30/ssis-two-ways-using-expressions-can-make-your-life-easier-multi-db-select-non-standard-db-select/

    What are the properties of your OLEDB Source Connection?

    Here is a screenshot of the properties of the OLE DB Source connection that is inside the DFT (the one in the loop).

    Chris

  • Martin,

    I got it working ! FINALLY !!! Here is what I had to do....

    The variable that I created to store the SQL command would not parse correctly because the CurrentFile variable was <EMPTY>, so when the user interface tried to parse the settings it gave me an error. Once I populated the filename variable with a legitimate table name, I was able to assign the SQL SELECT variable in the Variable Name pick list of the OLE DB Source Editor.

    Thank You once again for your assistance on this... it has been greatly appreciated.

    BTW... I think I will have to write a HOW-TO on this and post it for the community.

    THANX AGAIN !

    Chris

  • AmcAmx (8/11/2010)


    Martin,

    I got it working ! FINALLY !!! Here is what I had to do....

    The variable that I created to store the SQL command would not parse correctly because the CurrentFile variable was <EMPTY>, so when the user interface tried to parse the settings it gave me an error. Once I populated the filename variable with a legitimate table name, I was able to assign the SQL SELECT variable in the Variable Name pick list of the OLE DB Source Editor.

    Thank You once again for your assistance on this... it has been greatly appreciated.

    BTW... I think I will have to write a HOW-TO on this and post it for the community.

    THANX AGAIN !

    Chris

    Good job Chris. Glad I could help...and that you got it working.

    I think the "How-To" is a great idea, and could benefit many others.

    One final bit of advice about the issue where the package attempts to parse the SQL command: there is a property on the Connection Manager called "DelayValidation". If you set it to "true", the package will not try to validate that connection up front during the pre-validation/pre-execution phase; validation is delayed until the connection is actually used.

    You could also set this property on the Data Flow Task, and by doing that ensure that the package still works even though there is no value in the variable when execution starts.

    Have a good one.
