May 5, 2009 at 7:55 am
I have two large files, one with 100 million records and the other with 7 million. The Merge Join is taking too long to complete. Is there a faster alternative for joining these files?
May 5, 2009 at 8:57 am
I think I found a way to get the job done without the Merge Join taking so much time. I added another data flow where I load the contents of the larger file into a Cache Transform. Once that data flow completes, I execute my original data flow and use a Lookup transform to get the data I need from the cache, via the Cache Connection Manager that was used to cache the larger file in the first data flow.
May 5, 2009 at 9:37 am
I would load the data from the files into tables, use an Execute SQL task to create some indexes on those tables, and then join them with the appropriate JOIN depending on your requirements. Finally, run a SQL task to remove the tables.
This should run a lot faster than doing the processing entirely in SSIS.
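As a rough sketch of that approach (the table, column, and index names and the file paths below are just placeholders; BULK INSERT is shown as one way to land the files, but an SSIS data flow into the staging tables works just as well):

-- Stage both files into work tables.
CREATE TABLE dbo.Stg_Large (KeyCol INT NOT NULL, Payload VARCHAR(200));
CREATE TABLE dbo.Stg_Small (KeyCol INT NOT NULL, OtherCol VARCHAR(200));

BULK INSERT dbo.Stg_Large FROM 'C:\feeds\large_file.txt' WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n');
BULK INSERT dbo.Stg_Small FROM 'C:\feeds\small_file.txt' WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n');

-- Index the join column(s) after the load, not before, so the inserts stay fast.
CREATE INDEX IX_Stg_Large_Key ON dbo.Stg_Large (KeyCol);
CREATE INDEX IX_Stg_Small_Key ON dbo.Stg_Small (KeyCol);

-- Join in the database engine; change INNER to LEFT/RIGHT as the requirement dictates.
SELECT l.KeyCol, l.Payload, s.OtherCol
FROM dbo.Stg_Large AS l
INNER JOIN dbo.Stg_Small AS s
    ON s.KeyCol = l.KeyCol;

-- Clean up when finished.
DROP TABLE dbo.Stg_Large;
DROP TABLE dbo.Stg_Small;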
May 5, 2009 at 10:06 am
Will give it a try...Thanks
May 5, 2009 at 10:06 am
steveb (5/5/2009)
I would load the data from the files into tables, use an Execute SQL task to create some indexes on those tables, and then join them with the appropriate JOIN depending on your requirements. Finally, run a SQL task to remove the tables. This should run a lot faster than doing the processing entirely in SSIS.
That's what I would do too - except for the remove tables bit. I'd drop the indexes & then truncate - or truncate & then drop - not sure which is faster. Then the tables are empty and ready for the next load - if you are doing this regularly, of course.
Phil
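A sketch of that keep-the-tables cleanup, reusing the placeholder staging table and index names from the earlier example (the order of the two steps is the open question Phil mentions):

-- Drop the indexes and empty the tables instead of dropping the tables themselves,
-- so they are ready for the next load.
DROP INDEX IX_Stg_Large_Key ON dbo.Stg_Large;
DROP INDEX IX_Stg_Small_Key ON dbo.Stg_Small;

TRUNCATE TABLE dbo.Stg_Large;
TRUNCATE TABLE dbo.Stg_Small;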