• robert.wiglesworth (11/29/2016)


    Thanks Luis, I thought of that too, but didn't really want to do that because it still has to load the whole large table. Unfortunately, I don't have access to add the table to that server/database. I only have read access. And asking for help from the DBA of that server is probably going to not go anywhere... at least it would take weeks or months for even a response.

    Unfortunately, if you can't get a linked server, the only way you're going to achieve this is by loading at least one of the tables in SSIS.

    I just did a little bit of testing (using the StackOverflow Database, as it's pretty big), and doing it this way, the best way seem to be to use a Source which grabs all the orders (just get the columns you need), and the do a Lookup to the Employee table.

    The test I did for Stackoverflow was initially to get all users that had logged this year, and then get their Posts. This caused by PC to try and cache EVERY POST, as that was my lookup (a laughable endeavour, which basically ground my PC to a halt).

    Doing a SELECT ALL from the Post table, with a lookup of SELECT * FROM StackOverflow.dbo.Users U WHERE DATEPART(YEAR, U.LastActivityDate) = 2016 gave pretty quick results. I'll admit it's still running as I type this, but my PC is still running (It has 1.3 Million rows cached right now and isn't struggling with it's tiny 8GB of RAM), and the Database is hpsted on a test Desktop for us to play around on.

    Tl;DR Lookup doesn't really cache everything, just what you need. Create a Dataflow that simply gets everythingfrom your orders table, then do a Lookup to your Employees table where your region = 1 or 2. SSIS will cache only the applicable Employees, and then complete your lookup as it writes out the data to your destination.

    Thom~

    Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
    Larnu.uk