WHERE col NOT IN (SELECT...) not working

  • I have a parent child relation between to tables. Not all parents have children, some parents have 1 or more children. I want all of the Parent rows that do not have any Child rows.

    There are about 1 million Parent rows, and about 500 thousand Child rows, so I am sure there are at least 500 thousand Parent rows (there are several more because some Parents have multiple children). I am sure that my query is correct, it is in the form;

    SELECT *

    FROM Parent

    WHERE ParentID NOT IN (SELECT ParentID FROM Child)

    But, I am getting 0 results. My query plan follows as best I can visualize it here;

    SELECT <-- HASH Match (Right Anti Semi Join) <-- Index Scan (Child non-clustered Index)

    ^-- Nested Loops (Left Anti Semi Join [Warning: No Join Predicate]) <-- Table Scan (Parent)

    ^-- Row Count Spool (Lazy Spool) <-- Clustered Index Seek (Child)

    My query plan seems to show a broken query plan, but I have no clue what is causing it to choose this plan nor how to fix the plan.

    Any suggestions would be greatly appreciated!

    Dennis

  • Dennis D. Allen (1/21/2009)


    I have a parent child relation between to tables. Not all parents have children, some parents have 1 or more children. I want all of the Parent rows that do not have any Child rows.

    There are about 1 million Parent rows, and about 500 thousand Child rows, so I am sure there are at least 500 thousand Parent rows (there are several more because some Parents have multiple children). I am sure that my query is correct, it is in the form;

    SELECT *

    FROM Parent

    WHERE ParentID NOT IN (SELECT ParentID FROM Child)

    But, I am getting 0 results. My query plan follows as best I can visualize it here;

    SELECT <-- HASH Match (Right Anti Semi Join) <-- Index Scan (Child non-clustered Index)

    ^-- Nested Loops (Left Anti Semi Join [Warning: No Join Predicate]) <-- Table Scan (Parent)

    ^-- Row Count Spool (Lazy Spool) <-- Clustered Index Seek (Child)

    My query plan seems to show a broken query plan, but I have no clue what is causing it to choose this plan nor how to fix the plan.

    Any suggestions would be greatly appreciated!

    Dennis

    Try this:

    select

    p.*

    from

    Parent p

    left outer join Child c

    on (p.ParentID = c.ParentID)

    where

    c.ParentID is null

    Let us know if this works.

  • Depending on your indexes and such this could be rather a bit on the long side but why not something like the following???

    SELECT Parent.*

    FROM Parent Left Join Child ON Parent.ParentID = Child.ParentID

    WHERE child.ParentID IS NULL

    To help us help you read this[/url]For better help with performance problems please read this[/url]

  • One of these days I'll learn to recheck the original post before I post the same answer that's already been suggested a minute or two ago...

    -Luke.

    To help us help you read this[/url]For better help with performance problems please read this[/url]

  • Use the Force Luke...

    Actually, nothing wrong with posting an answer that's already been posted. One thing you'll see is a difference in formating styles.

  • Yeah, I used to run all my posts through a SQL formatter, but My HIPS software doesn't like any of the online ones anymore (keeps thinking I'm trying to run a SQL injection attack against them) so I just went with a copy and paste from the OP and added the appropriate syntax. Typically I'd end up having mine look eerily similar to yours, at least in my production code.

    -Luke.

    To help us help you read this[/url]For better help with performance problems please read this[/url]

  • Format by hand do I.

  • Greetings Lynn,

    Yes that does work and is a fine workaround.

    I want to understand why this does not work as it should work correctly and the fact that it is not might point to an error in my system that I would like to find and fix. Also, I am trying to learn to read the execution plan better.

    Any ideas?

  • Format by SQL Prompt I do, but what can I say I used to like being able to post in color... makes it much easier to read while answering posts...

    -Luke.

    To help us help you read this[/url]For better help with performance problems please read this[/url]

  • Dennis D. Allen (1/21/2009)


    Greetings Lynn,

    Yes that does work and is a fine workaround.

    I want to understand why this does not work as it should work correctly and the fact that it is not might point to an error in my system that I would like to find and fix. Also, I am trying to learn to read the execution plan better.

    Any ideas?

    Hi Dennis, the reason is really simple. When you run the query:

    SELECT ParentID FROM Child;

    You are returning at least one row where the ParentID is NULL. That forces the IN statement to evaluate to UNKNOWN which is not TRUE and you get no rows.

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • Dennis D. Allen (1/21/2009)


    Greetings Lynn,

    Yes that does work and is a fine workaround.

    I want to understand why this does not work as it should work correctly and the fact that it is not might point to an error in my system that I would like to find and fix. Also, I am trying to learn to read the execution plan better.

    Any ideas?

    Not that what I gave you is a workaround. I actually think that it is more efficient than using NOT IN. The execution plan showed a nested loop. Think about this for a moment. For each row in Parent, you have to loop through 1,000,000 values returned from Child before knowing if it does not exist in Child. Not very efficient.

    Now, the left outer join returns all records from Parent regardless of a matching record in Child, and you then filter on the Child ParentID, which will be null if there was no match to the ParentID from Parent.

  • Jeffrey Williams (1/21/2009)


    Hi Dennis, the reason is really simple. When you run the query:

    SELECT ParentID FROM Child;

    You are returning at least one row where the ParentID is NULL. That forces the IN statement to evaluate to UNKNOWN which is not TRUE and you get no rows.

    That was indeed the issue. Since the child table should not allow NULLs in the ParentID column, I never thought to look for that.

    Thank you so much!

    Dennis

  • Lynn Pettis (1/21/2009)


    Not that what I gave you is a workaround. I actually think that it is more efficient than using NOT IN. The execution plan showed a nested loop. Think about this for a moment. For each row in Parent, you have to loop through 1,000,000 values returned from Child before knowing if it does not exist in Child. Not very efficient.

    Now, the left outer join returns all records from Parent regardless of a matching record in Child, and you then filter on the Child ParentID, which will be null if there was no match to the ParentID from Parent.

    I intended workaround in the sense that the technique I was employing was causing unexpected results, and using a different technique was avoiding finding the cause of the unexpected behavior.

    Thank you for pointing out the efficiency gain in your technique. In comparing the executing plan and both the IO and TIME statistics, it is evident you are correct. Your query produces a 50% reduction in table scans on the Child and 15% reduction in logical reads on the child and a 15% gain in execution time.

    Thank you Lynn 🙂

  • Lynn Pettis (1/21/2009)


    Use the Force Luke...

    Doggone It! I've been waiting for weeks for an excuse to use that line on Luke!

    😎

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Ya know, I'd say that "I'd never join you" in my whiniest text possible, however who knows what the future will bring...

    And I've been wondering for weeks why we have 2 Vaders running around here, or are you and Perry one in the same?

    -Luke.

    To help us help you read this[/url]For better help with performance problems please read this[/url]

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply