NOT EXISTS - tortoise or horse ?

  • All,

    I have often worked on the below requirement.

    There are two tables and fetch the record from TableA which doesn't exists in TableB.

    I use NOTEXISTS or else I use temp table / table variable and include a field called "exists_flag". Based on this value, I will accomplish the requirement.

    CREATE TABLE Orders

    (order_nbr INTEGER NOT NULL PRIMARY KEY,

    some_col DECIMAL (9,2) NOT NULL)

    CREATE TABLE Order_Details

    (order_nbr INTEGER NOT NULL,

    sku INTEGER NOT NULL,

    item_price DECIMAL (9,2) NOT NULL,

    PRIMARY KEY(order_nbr, sku),

    FOREIGN KEY(order_nbr) REFERENCES Orders(order_nbr));

    INSERT INTO Orders VALUES (1, 0)

    INSERT INTO Orders VALUES (2, 0)

    INSERT INTO Orders VALUES (3, 0)

    INSERT INTO Orders VALUES (4, 0)

    INSERT INTO Order_Details VALUES (1, 2, 205.00)

    INSERT INTO Order_Details VALUES (1, 1, 500.00)

    INSERT INTO Order_Details VALUES (2, 1, 490.95)

    INSERT INTO Order_Details VALUES (3, 1, 480.00)

    My Code :

    select * from Orders

    where not exists ( select order_nbr from Order_Details where Orders.order_nbr = Order_Details.order_nbr)

    or

    select * from Orders a left join Order_Details b on a.order_nbr = b.order_nbr

    where b.order_nbr is null

    is there any other way to achieve this? first of all, is there any changes needs to be done on database design to handle this kind of requirement in an elegant way?

    is this the common requiremrnt surivie in the world for long time?

    karthik

  • this is elementary example of "difference between not exists and left join with null filter" so i dont think you will get any kind of difference in "execution plan" or any change in database design.but i often experince that not exists work better in these kind of scenarios when data volume is large. i will come back on this after my lunch again (may be with some stats 😀 )

    -------Bhuvnesh----------
    I work only to learn Sql Server...though my company pays me for getting their stuff done;-)

  • karthik M (12/26/2012)


    All,

    I have often worked on the below requirement.

    There are two tables and fetch the record from TableA which doesn't exists in TableB.

    I use NOTEXISTS or else I use temp table / table variable and include a field called "exists_flag". Based on this value, I will accomplish the requirement.

    CREATE TABLE Orders

    (order_nbr INTEGER NOT NULL PRIMARY KEY,

    some_col DECIMAL (9,2) NOT NULL)

    CREATE TABLE Order_Details

    (order_nbr INTEGER NOT NULL,

    sku INTEGER NOT NULL,

    item_price DECIMAL (9,2) NOT NULL,

    PRIMARY KEY(order_nbr, sku),

    FOREIGN KEY(order_nbr) REFERENCES Orders(order_nbr));

    INSERT INTO Orders VALUES (1, 0)

    INSERT INTO Orders VALUES (2, 0)

    INSERT INTO Orders VALUES (3, 0)

    INSERT INTO Orders VALUES (4, 0)

    INSERT INTO Order_Details VALUES (1, 2, 205.00)

    INSERT INTO Order_Details VALUES (1, 1, 500.00)

    INSERT INTO Order_Details VALUES (2, 1, 490.95)

    INSERT INTO Order_Details VALUES (3, 1, 480.00)

    My Code :

    select * from Orders

    where not exists ( select order_nbr from Order_Details where Orders.order_nbr = Order_Details.order_nbr)

    or

    select * from Orders a left join Order_Details b on a.order_nbr = b.order_nbr

    where b.order_nbr is null

    is there any other way to achieve this? first of all, is there any changes needs to be done on database design to handle this kind of requirement in an elegant way?

    is this the common requiremrnt surivie in the world for long time?

    A left outer join with a NULL exclusion in the WHERE clause will produce an execution plan that shows a full LEFT JOIN followed by a filter that removes the matched values and leaves the unmatched as the return. This means outputting all the rows in the two tables, though, and is inefficient.

    A better way to proceed is to use a subquery, like so:

    SELECT

    *

    FROM

    dbo.Orders AS O

    WHERE

    order_nbr NOT IN (SELECT order_nbr FROM Order_Details AS OD WHERE O.order_nbr = OD.order_nbr);

    This results in a Left Anti Semi Join nested loop that outputs only the rows that do not match, without the need for a filter.

    Roland Alexander 
    The Monday Morning DBA 
    There are two means of refuge from the miseries of life: music and cats. ~ Albert Schweitzer

  • http://sqlinthewild.co.za/index.php/2010/03/23/left-outer-join-vs-not-exists/

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Apology as i had hours-longed lunch 😛 .

    Bhuvnesh (12/26/2012)


    i will come back on this after my lunch again (may be with some stats 😀 )

    So i did the test with

    ----------------------------------

    Order table ~ 5k records

    Order_details ~ 26K records

    ----------------------------------

    i have attached the exec plan ( exec.png) which shown that there is slight difference like addition of stream- aggregate in case of EXISTS query and "FILTER" in case of LEFT JOIN query

    AND below are the stats which also shown slight difference.

    (20 row(s) affected)

    Table 'Order_Details'. Scan count 1, logical reads 90, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Orders'. Scan count 1, logical reads 36, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 32 ms, elapsed time = 31 ms.

    (20 row(s) affected)

    Table 'Order_Details'. Scan count 1, logical reads 90, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Orders'. Scan count 1, logical reads 36, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 31 ms, elapsed time = 26 ms.

    -------Bhuvnesh----------
    I work only to learn Sql Server...though my company pays me for getting their stuff done;-)

  • CELKO (12/26/2012)


    SELECT X.order_nbr

    FROM (SELECT order_nbr FROM Orders

    EXCEPT

    SELECT order_nbr FROM Order_details)

    AS X(order_nbr);

    how can we select remaining columns of Orders table ?

    if we have to join back to "order" then it will increase the "scan count" for this table

    -------Bhuvnesh----------
    I work only to learn Sql Server...though my company pays me for getting their stuff done;-)

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply