Re: Short-circuiting a query

  • Is there anyway to "short-circuit" a query? The following query generates the same "actual execution plan" regardless of the @brand_id value (null or non-null value):

    DECLARE @brand_id INT;

    SET @brand_id = null;

    SELECT pk_product_id

    FROM tbl_products

    WHERE @brand_id IS NULL OR fk_brand_id IN (SELECT pk_brand_id

    FROMtbl_brands AS b WITH (NOLOCK)

    WHEREb.fk_brand_map_id = @brand_id

    ORb.pk_brand_id = @brand_id)

    Ideally, if @brand_id IS NULL, I don't want to evaulate the inner query (SELECT ... FROM tbl_brands)

  • This should do the trick:

    [font="Courier New"]DECLARE @brand_id INT

    IF @brand_id IS NOT NULL

    BEGIN  

       SELECT p.pk_product_id

       FROM tbl_brands b (NOLOCK)

       INNER JOIN tbl_products p ON p.fk_brand_id = b.pk_brand_id

       WHERE b.fk_brand_map_id = @brand_id OR b.pk_brand_id = @brand_id)

    END

    ELSE

    BEGIN

       SELECT pk_product_id

       FROM tbl_products

       WHERE 0 = 1

    END

    [/font]

    Please note that the code is untested.

    Cheers

    ChrisM

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • jlp3630 (9/29/2008)


    Is there anyway to "short-circuit" a query? The following query generates the same "actual execution plan" regardless of the @brand_id value (null or non-null value):

    DECLARE @brand_id INT;

    SET @brand_id = null;

    SELECT pk_product_id

    FROM tbl_products

    WHERE @brand_id IS NULL OR fk_brand_id IN (SELECT pk_brand_id

    FROMtbl_brands AS b WITH (NOLOCK)

    WHEREb.fk_brand_map_id = @brand_id

    ORb.pk_brand_id = @brand_id)

    Ideally, if @brand_id IS NULL, I don't want to evaulate the inner query (SELECT ... FROM tbl_brands)

    The better approach might be to put an IF clause in so that you validate the variable prior to running the query. Then you don't have to use an OR statement either. Plus, you can do a JOIN rather than the IN statement, which is going to run very poorly in most situations. Something like this (untested):

    IF NOT (@brand_id IS NULL)

    BEGIN

    SELECT p.pk_product_id

    FROM tbl_products AS p

    JOIN tbl_brands AS b

    ON p.fk_brand_id = b.pk_brand_id

    WHERE b.fk_brand_map_id = @brand_id

    OR b.pk_brand_id = @brand_id

    END

    But that second OR is still going to cause you some difficulty.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Spooky, Grant. Very spooky 😎

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Funny that we both fixed the JOIN as well as putting in the IF.

    Well, normally I'd say to stay out of my head, but since you posted first, I'll try to stay out of yours in the future.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Are there any situations (resultset or performance) where using IN is preferred over JOIN?

  • It depends...

    IN basically is an OR statement. It's one of those hidden RBAR (row-by-agonizing-row, TM: Jeff Moden) operations because it's going to check each row of the IN clause against all the rows of the SELECT statement. If both sets of data are fairly small, three rows to three rows for example, then it's not necessarily a problem and it might even perform faster than using a JOIN (maybe). Circumstances like that are fairly rare. Much more common is to see hundreds or thousands of rows being compared to hundreds or thousands of rows... one... at... a.... time... Which, I'm sure you can imagine, isn't a good thing.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Grant Fritchey (9/30/2008)


    It depends...

    IN basically is an OR statement. It's one of those hidden RBAR (row-by-agonizing-row, TM: Jeff Moden) operations because it's going to check each row of the IN clause against all the rows of the SELECT statement.

    Not necessarily. Most of the time the optimiser will convert a subquery (even a correlated one) into a join. In this case, the subquery isn't correlated (no reference to the outer query) so there's no reason it has to be run once per row of the outer.

    Consider these two (adventureworks)

    select * from Sales.SalesOrderDetail where ProductID in (select ProductID from Production.Product where ListPrice > 100)

    select sod.* from Sales.SalesOrderDetail sod inner join Production.Product p on sod.ProductID = p.ProductID where ListPrice > 100

    They have an identical output and an identical execution plan. If you run via SQL 2008 management studio (so you can see the number of executes that occur per operator) they both show 1 execution for both the sales order detail and the products tables.

    If they're run with stats IO on, I get the following logical reads for both.

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0.

    Table 'SalesOrderDetail'. Scan count 1, logical reads 1240, physical reads 0.

    Table 'Product'. Scan count 1, logical reads 15, physical reads 0.

    I spent the entire weekend trying to get an example of a correlated subquery where the subquery was run once per row of the outer query, and I failed. Every query I tried, the optimiser managed to convert the subquery into a join that was run only once.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • I must be regurgitating bad stuff again. Sorry. It's when it's a delimited list that it behaves like that, right?

    However, I did a little looking around IN clauses don't work as well as JOINs. Here's an example (cleaned up from here: http://tgrignon.wordpress.com/2008/01/21/sql-server-performance-and-the-in-clause/ ):

    SELECT DISTINCT

    c.LastName

    ,c.FirstName

    FROM Person.Person c

    JOIN HumanResources.Employee e

    ON e.BusinessEntityId = c.BusinessEntityID

    WHERE e.BusinessEntityID IN (

    SELECT SalesPersonID

    FROM Sales.SalesOrderHeader

    WHERE SalesOrderID IN (

    SELECT SalesOrderID

    FROM Sales.SalesOrderDetail

    WHERE ProductID IN (SELECT ProductID

    FROM Production.Product p

    WHERE ProductNumber LIKE 'FW%'))) ;

    SELECT DISTINCT

    c.LastName

    ,c.FirstName

    FROM Person.Person c

    JOIN HumanResources.Employee e

    ON e.BusinessEntityId = c.BusinessEntityId

    JOIN Sales.SalesOrderHeader AS soh

    ON e.BusinessEntityID = soh.SalesPersonID

    JOIN Sales.SalesOrderDetail AS sod

    ON soh.SalesOrderID = sod.SalesOrderID

    JOIN Production.Product AS p

    ON sod.ProductID = p.ProductID

    WHERE p.ProductNumber LIKE 'FW%'

    Now you get quite different plans and quite different performance.

    I also this bit on issues around the lazy spool using NOT IN: http://sql-server-performance.com/Community/forums/p/2422/13885.aspx#13885

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Grant Fritchey (9/30/2008)


    I must be regurgitating bad stuff again. Sorry. It's when it's a delimited list that it behaves like that, right?

    You mean something like this?

    select ProductID from Sales.SalesOrderDetail where ProductID in (717,718,719,722,723,725,726,727,729,730,732,733,736,738,739,741,742,743,744,745)

    However, I did a little looking around IN clauses don't work as well as JOINs. Here's an example (cleaned up from here: http://tgrignon.wordpress.com/2008/01/21/sql-server-performance-and-the-in-clause/ ):

    <snip>

    Now you get quite different plans and quite different performance.

    Oh, there are cases where that happens, correlated and noncorrelated subqueries. It's just not a given for every case that involves a subquery of any form.

    And even in that example, the subquery is not executed row by row. In the exec plan I get, the seek on product only executes once. The seek on Sales order detail runs 16 times, but that's because of the nested loop join to product (Sales Order Detail is the inner table in the loop) and the 16 rows returned from product.

    If the subqueries were run once per row of the outer query, we would expect 3806 reads of SalesOrderDetail, because that's the number of rows returned by salesorderheader, and a really large number of reads of product.

    I also this bit on issues around the lazy spool using NOT IN: http://sql-server-performance.com/Community/forums/p/2422/13885.aspx#13885

    Interesting, thanks.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Are you kidding, thank you. I hate posting bad information.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Don't we all.

    Oh, and on SQL 2008, I get the following results for those two queries:

    Subqueries: (35% cost)

    Table 'Person'. Scan count 0, logical reads 45

    Table 'Worktable'. Scan count 0, logical reads 0

    Table 'SalesOrderDetail'. Scan count 16, logical reads 34

    Table 'Product'. Scan count 1, logical reads 2

    Table 'SalesOrderHeader'. Scan count 290, logical reads 641

    Table 'Employee'. Scan count 1, logical reads 2

    SQL Server Execution Times:

    CPU time = 15 ms, elapsed time = 195 ms.

    Joins: (65% cost)

    Table 'Person'. Scan count 0, logical reads 2615

    Table 'Worktable'. Scan count 0, logical reads 0

    Table 'SalesOrderHeader'. Scan count 1, logical reads 57

    Table 'SalesOrderDetail'. Scan count 16, logical reads 34

    Table 'Product'. Scan count 1, logical reads 2

    Table 'Employee'. Scan count 1, logical reads 2

    SQL Server Execution Times:

    CPU time = 16 ms, elapsed time = 344 ms.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Grant Fritchey (9/29/2008)

    IF NOT (@brand_id IS NULL)

    BEGIN

    SELECT p.pk_product_id

    FROM tbl_products AS p

    JOIN tbl_brands AS b

    ON p.fk_brand_id = b.pk_brand_id

    WHERE b.fk_brand_map_id = @brand_id

    OR b.pk_brand_id = @brand_id

    END

    But that second OR is still going to cause you some difficulty.

    Assuming I have indices on b.fk_brand_map_id and b.pk_brand_id, would using the UNION operator get around the problem of the second OR.

    SELECT p.pk_product_id

    FROM tbl_products AS p

    JOIN tbl_brands AS b

    ON p.fk_brand_id = b.pk_brand_id

    WHERE b.fk_brand_map_id = @brand_id

    UNION

    SELECT p.pk_product_id

    FROM tbl_products AS p

    JOIN tbl_brands AS b

    ON p.fk_brand_id = b.pk_brand_id

    WHERE b.pk_brand_id = @brand_id

  • Maybe. It's worth testing, however in 2005 and higher, the optimiser has the same options for optimising an 'or' as it has optimising a 'union', and there's a good chance that the two will have a very similar plan.

    The conversion or an or to a union was a common trick in SQL 2000, as the optimiser had several ways to do a 'union' and much fewer for an 'or'

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply