Not in

  • I wonder if anyone can help me decided the best coding to choose for this job

    Insert into tablec

    Select a.col1, a.col2, b.col1, b.col2 from tablea a join tableb b on a.col3 = b.col3

    Insert into errortable

    Select * from tablea where col1 not in (select col1 from tablec)

    The 'not in' statement is taking a long time as you would expect. Does anyone have any ideas on how to change the statement to speed it up

    All help appreciated

  • Try

    select

    a.*

    from tablea a

    left outer join table c

    on a.col1 = c.col1

    where c.col1 is null

    Might not be much faster though

    Steve Jones

    sjones@sqlservercentral.com

    http://www.sqlservercentral.com/columnists/sjones

    http://www.dkranch.net

  • Select a.col1, a.col2, b.col1, b.col2 from tablea NOT EXIST (select 1 from tableb b where a.col3=b.col3)

  • Steve

    Beware of using your method, I have seen a case where that didn't work.

    I had 2 very similar queries, one worked, the other didn't. When I examined the query plans, it became a bit more apparent why there was a difference.

    Using your example, if SQL Server chooses to filter rows from table c early in the query, it will not find any rows where col1 is null, and will therefore return no rows for the query as a whole.

    However, if it decides to filter rows from table c at a later point in the query plan, the query works as expected.

    I don't know if this is expected behaviour or a bug.

  • ian - i don't see how what you describe can happen unless the query is more complex than the one shown. I assume you are not suggesting that the query plan adopted affects the records returned. However, if this is a problem, it's potentially a major one. Do you have the code of the two statements, by any chance? Failing that, can you give any more info on the circumstances in which this happens?

    Tim Wilkinson

    "If it doesn't work in practice, you're using the wrong theory"
    - Immanuel Kant

  • stax68

    You're right, the query was considerably more complex than this.

    It happened a while ago, so I don't know if I will be able to find the details. If I do, I'll post it here.

    There is an example of something similar on page 21 of the 'Advanced Transact-SQL for SQL Server 2000' book by Itzik Ben-Gen and Tom Moreau.

  • SO WE ARE SAYING THAT IS NOT THE WAY TO GO?

  • No - I think it *is* the way to go. Joins are the standard and normally the best method, and Steve Jones' suggestion is certainly the way I, and most TSQL developers I have met, would approach this.

    But as Steve says, the performance improvement way well be quite minimal. If you need to improve performance further, one possibility is to look at indexing the join columns (a.col1, c.col1).

    I'm very interested to see Ian's problem, but whatever it was, was certainly very uncommon and it certainly won't affect your very simple query.

    The time taken is also down to the fact you are inserting the results into a table. You could try removing as many fields as possible from tablea and the insert, and only insert the columns you actually need (maybe even just a primary key column) into the table. This is especially likely to speed things up if there are any long character columns in tablea, which need not be included in the insert.

    Tim

    Tim Wilkinson

    "If it doesn't work in practice, you're using the wrong theory"
    - Immanuel Kant

  • quote:


    I wonder if anyone can help me decided the best coding to choose for this job

    Insert into tablec

    Select a.col1, a.col2, b.col1, b.col2 from tablea a join tableb b on a.col3 = b.col3

    Insert into errortable

    Select * from tablea where col1 not in (select col1 from tablec)

    The 'not in' statement is taking a long time as you would expect. Does anyone have any ideas on how to change the statement to speed it up

    All help appreciated


    Nazims query is OK , only that its missing a keyword. try this sone instead.

    Insert into errortable

    Select a.col1, a.col2, a.col3 from tablea a where WHERE NOT EXISTS (select 1 from tablec c where c.col1=a.col1)

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply