Select rows from table b not in table a?

  • I have made a small amount of progress with learning SSIS. But I am struggling with it more than I have with any other tool, language etc. I don't know why. It just seems very unintuitive to me and a bit buggy?

    I am trying to start simple. I have a flat file that I am successfully connecting to and successfully reading (only by setting Ignore Failure on all output columns). I also have a simple query I am running. ie. One Flat File Source and one OLE DB Source. I would like to select all the rows in query b that are not found in extract file a. I played around with Merge Join and got that to work. But I don't think that is what I need. What I really want is so easy in SQL.

    SELECT * FROM tableA A

    WHERE A.account_id NOT IN

    (SELECT account_id

    FROM tableB)

    That is it. How could I do this very simple task in SSIS?

    Thanks a ton in advance!

    Chris

  • Nope - pretty much bang on. Use the join, with sorted inputs, and left outer it.

    For the inverse, you can use a lookup component.

    To open an old can of worms, stay away from Where not in. It's non-sarg query and will hurt. Google it 🙂

    Cheers,CrispinI can't die, there are too many people who still have to meet me!It's not a bug, SQL just misunderstood me!

  • For small datasets, the Sort+Merge Join with an Outer Join is a good fit. It behaves most like Join syntax you'd be used to in T-SQL. You can then use a Conditional split to redirect rows that are in your flat file but not in your T-SQL with an expression like ISNULL(table_key_id). You can play with the Sort settings for your columns to change things like case sensitivity. You might also want to use a Derived Column transform before your Sort to clean-up data (i.e. TRIM(text_column)).

    One other note, sort your smallest columns first, this can dramatically improve performance for large datasets.

    Alternatively, you can do the same thing using a Lookup. This is definitely faster, but can be difficult if you are trying to match on text data.

  • For the problem above, you could not use the lookup as the source for the lookup is a text file.

    A tip on the sort and cleaning: Do what you can in SQL. Use an order by in SQL. It's faster than SSIS at it. Do any trims , joins etc in SQL. Trimming a string in SQL means that potentially, less data could arrive for SSIS to deal with. It is obviously a good thing...

    Cheers,CrispinI can't die, there are too many people who still have to meet me!It's not a bug, SQL just misunderstood me!

  • I'm not sure I understand why you say lookup wouldn't work with a text source (or a text reference). Albeit slower thanks to the lack of indexing, I don't see why it wouldn't work. Once you set up the source and/or the reference file to appropriate read said file, what does the internal structure of the file have to do with anything?

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Matt Miller (1/30/2008)


    I'm not sure I understand why you say lookup wouldn't work with a text source (or a text reference). Albeit slower thanks to the lack of indexing, I don't see why it wouldn't work. Once you set up the source and/or the reference file to appropriate read said file, what does the internal structure of the file have to do with anything?

    A lookup component needs an OLEDB source for it's reference data. It cannot use a flat file.

    Regarding the indexing, by default a lookup component caches everything and once the cache is hot, external indexing means nothing. It's internal to the SSIS engine. If you use partial caching, then the index on the lookup query is critical.

    Data from anywhere can "flow over" a lookup component. However, the component needs to get it's refernce data from OLE DB.

    Cheers,CrispinI can't die, there are too many people who still have to meet me!It's not a bug, SQL just misunderstood me!

  • It's those little things you don't notice until you want to use them... You're absolutely right.

    Learned yet another tidbit.

    Thanks!

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • I am starting to get the hang of it now. Thanks guys.

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply