Need help with figuring out query performance problems

  • We have a stored procedure that searches products based on a number of input parameters that differ from one scenario to the next. Depending on the input parameters the search involves anywhere from two to about a dozen different tables. In order to avoid unnecessary joins we build the actual search query as dynamic SQL and execute it inside the stored procedure.

    In one of the most basic scenarios the user searches products by a keyword alone (see Query 1 below), which usually takes less than a second. However, if they search by a keyword and department (Query 2 below), the execution time goes up to well over a minute, and the execution plan looks somewhat different (the attached snapshots of the plans are showing just the parts that differ).

    Query 1 (fast)

    SELECT DISTINCT

    Product.ProductID, Product.Title

    FROM

    Product

    INNER JOIN ProductVariant ON (ProductVariant.ProductID = Product.ProductID)

    WHERE (1=1)

    AND (CONTAINS((Product.*), @Keywords) OR CONTAINS((ProductVariant.*), @Keywords))

    AND (Product.SourceID = @SourceID)

    AND (Product.ProductStatus = @ProductStatus)

    AND (ProductVariant.ProductStatus = @ProductStatus)

    Query 2 (slow)

    SELECT DISTINCT

    Product.ProductID, Product.Title

    FROM

    Product

    INNER JOIN ProductVariant ON (ProductVariant.ProductID = Product.ProductID)

    WHERE (1=1)

    AND (CONTAINS((Product.*), @Keywords) OR CONTAINS((ProductVariant.*), @Keywords))

    AND (Product.SourceID = @SourceID)

    AND (Product.DepartmentID = @DepartmentID)

    AND (Product.ProductStatus = @ProductStatus)

    AND (ProductVariant.ProductStatus = @ProductStatus)

    Both the Product and ProductVariant table have some string columns that participate in the full-text index. The Product table has a non-clustered indexed on the SourceID column and another non-clustered indexed on SourceID+DepartmentID (this redundancy is not an oversight but is intended). ProductVariant.ProductID is a FK to Product and has a non-clustered index on it. Statistics are updated for all indexes and columns, and no missing indexes are reported by SQL Management Studio.

    Any suggestions on what might be causing this drastically different performance?

  • Adding myself to thread so I can see replies

  • Someone on another site guessed that perhaps SQL Server was executing full-text search for every row returned by the other branch of the execution plan and suggested to try breaking the full-text search logic out into its own query and to use a temp table to filter the results of the main query. It made sense to me, so here's the reworked code that returns results in just 2 seconds:

    SELECT

    [Key] AS ProductID

    INTO

    #matchingProducts

    FROM

    CONTAINSTABLE(Product, *, @Keywords)

    SELECT

    [Key] AS VariantID

    INTO

    #matchingVariants

    FROM

    CONTAINSTABLE(ProductVariant, *, @Keywords)

    SELECT DISTINCT

    Product.ProductID, Product.Title

    FROM

    Product

    INNER JOIN ProductVariant ON (ProductVariant.ProductID = Product.ProductID)

    LEFT OUTER JOIN #matchingProducts ON #matchingProducts.ProductID = Product.ProductID

    LEFT OUTER JOIN #matchingVariants ON #matchingVariants.VariantID = ProductVariant.VariantID

    WHERE (1=1)

    AND (Product.SourceID = @SourceID)

    AND (Product.ProductStatus = @ProductStatus)

    AND (ProductVariant.ProductStatus = @ProductStatus)

    AND (Product.DepartmentID = @DepartmentID)

    AND (NOT #matchingProducts.ProductID IS NULL OR NOT #matchingVariants.VariantID IS NULL)

    While this certainly works, I wonder if there's a better way to explicitly tell SQL Server to use a more efficient plan.

  • Would help if you posted the DDL for the tables including indexes and uploaded the actual execution plans instead of just posting pictures.

  • Curiously, when I tried to simplify the above solution using nested queries as shown below, the results were somewhere in-between in terms of speed (about 25 secs). Theoretically, the query below should be identical to the previous one that uses temp tables, yet somehow SQL Server internally compiles and optimizes the second one differently.

    SELECT DISTINCT

    Product.ProductID, Product.Title

    FROM

    Product

    INNER JOIN ProductVariant ON (ProductVariant.ProductID = Product.ProductID)

    LEFT OUTER JOIN

    (

    SELECT

    [Key] AS ProductID

    FROM

    CONTAINSTABLE(Product, *, @Keywords)

    ) MatchingProducts

    ON MatchingProducts.ProductID = Product.ProductID

    LEFT OUTER JOIN

    (

    SELECT

    [Key] AS VariantID

    FROM

    CONTAINSTABLE(ProductVariant, *, @Keywords)

    ) MatchingVariants

    ON MatchingVariants.VariantID = ProductVariant.VariantID

    WHERE (1=1)

    AND (Product.SourceID = @SourceID)

    AND (Product.ProductStatus = @ProductStatus)

    AND (ProductVariant.ProductStatus = @ProductStatus)

    AND (Product.DepartmentID = @DepartmentID)

    AND (NOT MatchingProducts.ProductID IS NULL OR NOT MatchingVariants.VariantID IS NULL)

  • Lynn Pettis: I am attaching the files. Please let me know if you need anything else. Thanks!

  • Caspian Canuck (12/21/2015)


    Curiously, when I tried to simplify the above solution using nested queries as shown below, the results were somewhere in-between in terms of speed (about 25 secs). Theoretically, the query below should be identical to the previous one that uses temp tables, yet somehow SQL Server internally compiles and optimizes the second one differently.

    SELECT DISTINCT

    Product.ProductID, Product.Title

    FROM

    Product

    INNER JOIN ProductVariant ON (ProductVariant.ProductID = Product.ProductID)

    LEFT OUTER JOIN

    (

    SELECT

    [Key] AS ProductID

    FROM

    CONTAINSTABLE(Product, *, @Keywords)

    ) MatchingProducts

    ON MatchingProducts.ProductID = Product.ProductID

    LEFT OUTER JOIN

    (

    SELECT

    [Key] AS VariantID

    FROM

    CONTAINSTABLE(ProductVariant, *, @Keywords)

    ) MatchingVariants

    ON MatchingVariants.VariantID = ProductVariant.VariantID

    WHERE (1=1)

    AND (Product.SourceID = @SourceID)

    AND (Product.ProductStatus = @ProductStatus)

    AND (ProductVariant.ProductStatus = @ProductStatus)

    AND (Product.DepartmentID = @DepartmentID)

    AND (NOT MatchingProducts.ProductID IS NULL OR NOT MatchingVariants.VariantID IS NULL)

    With the temporary tables you're forcing materialization of the intermediate result sets; put another way, you're forcing portions of the query to be processed in a particular order. Subqueries (and CTEs and views) do not necessarily work that way. The optimizer is free to transform such a query into something logically equivalent that does not access the underlying tables or apply your filters in the order you think you're specifying, so long as it thinks the alternative is cheaper.

    That's all part of T-SQL being declarative instead of procedural, and usually works out well. Sometimes, though, the optimizer just gets that wrong, and it's helpful to force portions of the query to be processed a particular order (as with piping intermediate results to temp tables). There's a long-standing still-active connect item requesting the ability to force that behavior (https://connect.microsoft.com/SQLServer/feedback/details/218968/provide-a-hint-to-force-intermediate-materialization-of-ctes-or-derived-tables).

    At any rate, it's important to know that using subqueries/views/CTEs does not force those result sets to be materialized first. Usually that's wonderful because you can write queries in more readable ways without forcing SQL Server to process the query in the order written; after all, the query whose written form most explicitly maps to the most efficient execution plan might be well-nigh unreadable, and a query plan that most explicitly maps to a readable written form might be terribly inefficient. It's nice to let the optimizer worry about those sorts of transformations when you can get away with it 🙂

    Cheers!

  • Jacob, thank you for the great explanation! 🙂

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply