Slower performance in Prod, Plan has Index Spool Eager Spool

  • Same sql in prod runs in 23 min and in QA/test it is 11 min.
    Since the QA db is refreshed from prod daily, it's the same database on both.
    cpu utilization is about 50-60% on prod, 20% on qa.  i believe this is likely the root cause for slower performance.
    Plans are the same.  see attached.

    The biggest cost according to the plan is the index spool eager spool surrounding the biggest table (21 million rows).
    1.  What else could the cause be that prod runs slower tha QA?

    2. Is there a way to improve the plan/get rid of the index/eager spools?

  • steve.gambale - Thursday, September 6, 2018 2:29 PM

    Same sql in prod runs in 23 min and in QA/test it is 11 min.
    Since the QA db is refreshed from prod daily, it's the same database on both.
    cpu utilization is about 50-60% on prod, 20% on qa.  i believe this is likely the root cause for slower performance.
    Plans are the same.  see attached.

    The biggest cost according to the plan is the index spool eager spool surrounding the biggest table (21 million rows).
    1.  What else could the cause be that prod runs slower tha QA?

    2. Is there a way to improve the plan/get rid of the index/eager spools?

    Part of it is the Cartesian join (cross join).  You could try the following, would make it a triangular join and hopefully reduce the number of rows returned.

    FROM
     (ResultsQuery r
      LEFT OUTER JOIN OfacValidation v ON v.DatabaseTableID    = r.DatabaseTableID
                AND v.TableIdentityValue   = r.TableIdentityValue
                AND ISNULL(v.FrontierName, '') = ISNULL(r.FrontierName, '')
                AND ISNULL(v.SdnName, '')  = ISNULL(r.SdnName, ''))
      INNER JOIN PercentMatchQuery m
      ON r.PercentMatch < m.PercentMatchMinimum

  • Thanks.
    Removing cross join does remove the index spool eager spool.  it runs in about 16 min now in prod.
    Not sure if this is what the results should be as the output is much less - will ask developer.

    The one difference in the plans are in QA there is a keyword OPTIMIZED, but not in prod.  This is for the second largest table in the sql in this batch (this is a series of 5 sql statements), but not where the cross join is.
    I looked around and it's a little involved, but if the optimizer believes a benefit for a nested loops join, it may try to reorder the input rows to improve I/O performance

    Also, the sqls are using result sets, then are referring to the result sets further down.  not sure if that's efficient or not, but they need to run sql, then take that output for the next step (sql).  might be better or same performance as doing temporary tables.

  • steve.gambale - Friday, September 7, 2018 10:56 AM

    Thanks.
    Removing cross join does remove the index spool eager spool.  it runs in about 16 min now in prod.
    Not sure if this is what the results should be as the output is much less - will ask developer.

    The one difference in the plans are in QA there is a keyword OPTIMIZED, but not in prod.  This is for the second largest table in the sql in this batch (this is a series of 5 sql statements), but not where the cross join is.
    I looked around and it's a little involved, but if the optimizer believes a benefit for a nested loops join, it may try to reorder the input rows to improve I/O performance

    Also, the sqls are using result sets, then are referring to the result sets further down.  not sure if that's efficient or not, but they need to run sql, then take that output for the next step (sql).  might be better or same performance as doing temporary tables.

    I can't guarantee what I provided is correct.  You didn't provide anything with which to test the code so what I gave you is a best guess.  I was attempting to reduce the number of rows of data that were being processed based on what I read in the code, nothing else.  Also, some times breaking things down into smaller pieces (divide and conquer) can also improve performance as it allows you to create temporary tables that can be indexed differently than the original source tables to improve performance as well as reducing the size of the result sets being used.

  • I cannot see anything from OfacValidation v used anywhere else in the query:
    SELECT
      @CreatedBy_AssociateID  AS CreatedBy_AssociateID,
      @RunDateTime     AS CreatedDateTime,
      @CreatedBy_AssociateID  AS UpdatedBy_AssociateID,
      @RunDateTime     AS UpdatedDateTime,
      r.DatabaseTableID,
      r.TableIdentityValue,
      r.ColumnName,
      r.FrontierName,
      r.SdnName,
      r.PercentMatch,
      CASE WHEN r.PercentMatch < m.PercentMatchMinimum THEN 222 ELSE NULL END AS ValidatedBy_AssociateID,
      CASE WHEN r.PercentMatch < m.PercentMatchMinimum THEN @RunDateTime ELSE NULL END AS ValidatedDateTime,
      r.City,
      r.StateProvince,
      r.Ofac_DatabaseTableID
    FROM ResultsQuery r
      LEFT OUTER JOIN OfacValidation v ON v.DatabaseTableID    = r.DatabaseTableID
                AND v.TableIdentityValue   = r.TableIdentityValue
                AND ISNULL(v.FrontierName, '') = ISNULL(r.FrontierName, '')
                AND ISNULL(v.SdnName, '')  = ISNULL(r.SdnName, '')
      CROSS JOIN PercentMatchQuery m
    WHERE v.OfacValidationID IS NULL

    I assume, the LEFT JOIN is needed only to check for existence of corresponding records.
    If so, then it must be expressed in the query exactly like that - WHERE NOT EXISTS:

    SELECT
      @CreatedBy_AssociateID  AS CreatedBy_AssociateID,
      @RunDateTime     AS CreatedDateTime,
      @CreatedBy_AssociateID  AS UpdatedBy_AssociateID,
      @RunDateTime     AS UpdatedDateTime,
      r.DatabaseTableID,
      r.TableIdentityValue,
      r.ColumnName,
      r.FrontierName,
      r.SdnName,
      r.PercentMatch,
      CASE WHEN r.PercentMatch < m.PercentMatchMinimum THEN 222 ELSE NULL END AS ValidatedBy_AssociateID,
      CASE WHEN r.PercentMatch < m.PercentMatchMinimum THEN @RunDateTime ELSE NULL END AS ValidatedDateTime,
      r.City,
      r.StateProvince,
      r.Ofac_DatabaseTableID
    FROM ResultsQuery r
      CROSS JOIN PercentMatchQuery m
    WHERE NOT EXISTS (
        select * from OfacValidation v
        where v.DatabaseTableID    = r.DatabaseTableID
            AND v.TableIdentityValue   = r.TableIdentityValue
            AND ISNULL(v.FrontierName, '') = ISNULL(r.FrontierName, '')
       AND ISNULL(v.SdnName, '')  = ISNULL(r.SdnName, '')
            )

    _____________
    Code for TallyGenerator

  • steve.gambale - Thursday, September 6, 2018 2:29 PM

    Same sql in prod runs in 23 min and in QA/test it is 11 min.
    Since the QA db is refreshed from prod daily, it's the same database on both.
    cpu utilization is about 50-60% on prod, 20% on qa.  i believe this is likely the root cause for slower performance.
    Plans are the same.  see attached.

    The biggest cost according to the plan is the index spool eager spool surrounding the biggest table (21 million rows).
    1.  What else could the cause be that prod runs slower tha QA?

    2. Is there a way to improve the plan/get rid of the index/eager spools?

    Heh.... 50-60% CPU utilization in prod and 20% in QA.  That should be a bell ringer right there.  Prod is busier than QA and there and even if those other processes in Prod aren't hitting your table, your code has to wait its turn in the time share known as number of cores.

    The other thing is that if the QA box is at 20% average CPU, then something is really wrong with the code.  QA should be dead quiet compared to Prod because there are so few users in QA compared to Prod.  You need to start doing a deep dive into the code, find the top 10 offenders, and fix them.  If there's still so much CPU usage, wash, rinse, and repeat for the next 10.

    Performance is in the code... or not. 😀

    And, yes... there's a way to get rid of "Eager Spools".  It's going to take better code.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • thanks for the replies.
    These are consolidated sql servers on QA and prod with more than 80 databases on them.  one sql server on each server.
    That's probably why QA is not less than 20% utilized.

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply