SSIS Perfomance Tuning / Build logic

  • hi All,

    I have a table as below as my source , reference table to look up and Target. Thing is , i have millions of records to load. So, when i create , i have to read all data at once from source , so i can use the data to look up several times and insert the records as it matches Reference table. How this can be done in SSIS without affecting the performance?

     

    SOURCE

    Reference Table :

     

    TARGET DATA , should looks like this ( I know people may have many races as per my data :))

     

     

     

  • My approach would be to get this working in TSQL then put the TSQL into SSIS as your source.

    My approach (likely not the most efficient) would be to do it with a UNION.  So something along the lines of:

    SELECT [ID], [REF DESCRIPTION]
    FROM [SOURCE]
    JOIN [REFERENCE TABLE] ON [SOURCE].[RACE_ASIAN] = [REFERENCE TABLE].[REF CODE]
    AND [REFERENCE TABLE].[Column Name 1] = 'RACE_ASIAN'
    UNION
    SELECT [ID], [REF DESCRIPTION]
    FROM [SOURCE]
    JOIN [REFERENCE TABLE] ON [SOURCE].[RACE_WHITE] = [REFERENCE TABLE].[REF CODE]
    AND [REFERENCE TABLE].[Column Name 1] = 'RACE_WHITE'
    UNION
    ...

    The reason I would go with the above approach is it is easy to test (you can grab just one or more of the SELECT's in the UNION and see the value(s)).  One reason I wouldn't like this approach is it isn't scalable.  What I mean is if you add another race, you need to add another column AND another row to the REFERENCE TABLE data.

     

    EDIT - I should add that I have not tested the above for performance.  For performance sake, I think it does depend on your environment.  If you have a lot of free memory, you can probably process the whole thing in memory rather than writing it out to SQL and pulling it back in.  If memory is tight, you may benefit from writing it to SQL as is and then pulling it back into memory using my above method.

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

Viewing 2 posts - 1 through 1 (of 1 total)

You must be logged in to reply to this topic. Login to reply