September 6, 2018 at 2:29 pm
Same SQL in prod runs in 23 min; in QA/test it is 11 min.
Since the QA db is refreshed from prod daily, it's the same database on both.
CPU utilization is about 50-60% on prod, 20% on QA. I believe this is likely the root cause of the slower performance.
Plans are the same. See attached.
The biggest cost according to the plan is the Index Spool (Eager Spool) surrounding the biggest table (21 million rows).
1. What else could the cause be that prod runs slower than QA?
2. Is there a way to improve the plan / get rid of the index/eager spools?
September 6, 2018 at 5:15 pm
steve.gambale - Thursday, September 6, 2018 2:29 PM
Part of it is the Cartesian join (cross join). You could try the following; it would turn it into a triangular join and hopefully reduce the number of rows returned.
FROM
(ResultsQuery r
LEFT OUTER JOIN OfacValidation v ON v.DatabaseTableID = r.DatabaseTableID
AND v.TableIdentityValue = r.TableIdentityValue
AND ISNULL(v.FrontierName, '') = ISNULL(r.FrontierName, '')
AND ISNULL(v.SdnName, '') = ISNULL(r.SdnName, ''))
INNER JOIN PercentMatchQuery m
ON r.PercentMatch < m.PercentMatchMinimum
September 7, 2018 at 10:56 am
Thanks.
Removing the cross join does remove the Index Spool (Eager Spool); it runs in about 16 min now in prod.
Not sure if the results are what they should be, as the output is much smaller - will ask the developer.
The one difference in the plans is that in QA there is a keyword OPTIMIZED, but not in prod. This is on the second largest table in this batch (a series of 5 SQL statements), not where the cross join is.
I looked around and it's a little involved, but if the optimizer believes a nested loops join would benefit, it may try to reorder the input rows to improve I/O performance.
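If you want to rule the OPTIMIZED (batch sort) nested loops prefetch in or out as the source of the plan difference, SQL Server 2016 SP1 and later expose a documented USE HINT that disables it per statement. The query below is only a placeholder; substitute the real statement from the batch:

```sql
-- Disable the optimized nested loops prefetch for this statement only,
-- so prod and QA compile the same shape of plan for comparison.
-- Placeholder query; the tables come from the thread, not the real batch.
SELECT r.DatabaseTableID, r.TableIdentityValue
FROM ResultsQuery r
INNER JOIN PercentMatchQuery m
    ON r.PercentMatch < m.PercentMatchMinimum
OPTION (USE HINT ('DISABLE_OPTIMIZED_NESTED_LOOP'));
```

Compare the actual plans with and without the hint before deciding anything; the OPTIMIZED attribute is usually a symptom, not the cause.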
Also, the SQL statements build result sets and then refer to those result sets further down. Not sure if that's efficient or not, but they need to run one statement, then take that output into the next step. It might perform better than, or the same as, using temporary tables.
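One way to test the temp-table idea: materialize the intermediate result once, then have the later steps read the temp table instead of re-deriving the result set. Column choices and the threshold below are illustrative only:

```sql
-- Step 1: materialize the intermediate result set once.
SELECT r.DatabaseTableID,
       r.TableIdentityValue,
       r.PercentMatch
INTO   #Results
FROM   ResultsQuery r;

-- Step 2: later statements in the batch read the temp table
-- instead of repeating the result-set logic.
SELECT COUNT(*)
FROM   #Results
WHERE  PercentMatch < 0.85;   -- illustrative threshold

DROP TABLE #Results;
```

Temp tables also get their own statistics, which can give the later steps better cardinality estimates than chained result sets.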
September 7, 2018 at 11:02 am
steve.gambale - Friday, September 7, 2018 10:56 AM
I can't guarantee what I provided is correct. You didn't provide anything with which to test the code, so what I gave you is a best guess; I was attempting to reduce the number of rows being processed based on what I read in the code, nothing else. Also, sometimes breaking things down into smaller pieces (divide and conquer) can improve performance: it allows you to create temporary tables that can be indexed differently than the original source tables, and it reduces the size of the result sets being used.
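As a sketch of the divide-and-conquer idea: land only the columns the join needs in a temp table, then index it to match the join, which the original source tables' indexes may not. Everything here is illustrative, built from the table names in the thread:

```sql
-- Keep only the columns the join needs; the 21M-row base table
-- shrinks to just the candidate rows and columns.
SELECT DatabaseTableID, TableIdentityValue, FrontierName, SdnName
INTO   #Candidates
FROM   ResultsQuery;

-- Index the temp table for the equality part of the join,
-- something the original source table may not offer.
CREATE CLUSTERED INDEX cx_Candidates
    ON #Candidates (DatabaseTableID, TableIdentityValue);
```

An index spool in a plan is often the optimizer building exactly this kind of indexed worktable for you at run time; building it yourself makes the cost visible and repeatable.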
September 10, 2018 at 7:30 am
I cannot see anything from OfacValidation v used anywhere else in the query:
SELECT
@CreatedBy_AssociateID AS CreatedBy_AssociateID,
@RunDateTime AS CreatedDateTime,
@CreatedBy_AssociateID AS UpdatedBy_AssociateID,
@RunDateTime AS UpdatedDateTime,
r.DatabaseTableID,
r.TableIdentityValue,
r.ColumnName,
r.FrontierName,
r.SdnName,
r.PercentMatch,
CASE WHEN r.PercentMatch < m.PercentMatchMinimum THEN 222 ELSE NULL END AS ValidatedBy_AssociateID,
CASE WHEN r.PercentMatch < m.PercentMatchMinimum THEN @RunDateTime ELSE NULL END AS ValidatedDateTime,
r.City,
r.StateProvince,
r.Ofac_DatabaseTableID
FROM ResultsQuery r
LEFT OUTER JOIN OfacValidation v ON v.DatabaseTableID = r.DatabaseTableID
AND v.TableIdentityValue = r.TableIdentityValue
AND ISNULL(v.FrontierName, '') = ISNULL(r.FrontierName, '')
AND ISNULL(v.SdnName, '') = ISNULL(r.SdnName, '')
CROSS JOIN PercentMatchQuery m
WHERE v.OfacValidationID IS NULL
I assume the LEFT JOIN is needed only to check for the existence of corresponding records.
If so, then it should be expressed in the query exactly like that - WHERE NOT EXISTS:
SELECT
@CreatedBy_AssociateID AS CreatedBy_AssociateID,
@RunDateTime AS CreatedDateTime,
@CreatedBy_AssociateID AS UpdatedBy_AssociateID,
@RunDateTime AS UpdatedDateTime,
r.DatabaseTableID,
r.TableIdentityValue,
r.ColumnName,
r.FrontierName,
r.SdnName,
r.PercentMatch,
CASE WHEN r.PercentMatch < m.PercentMatchMinimum THEN 222 ELSE NULL END AS ValidatedBy_AssociateID,
CASE WHEN r.PercentMatch < m.PercentMatchMinimum THEN @RunDateTime ELSE NULL END AS ValidatedDateTime,
r.City,
r.StateProvince,
r.Ofac_DatabaseTableID
FROM ResultsQuery r
CROSS JOIN PercentMatchQuery m
WHERE NOT EXISTS (
SELECT *
FROM OfacValidation v
WHERE v.DatabaseTableID = r.DatabaseTableID
AND v.TableIdentityValue = r.TableIdentityValue
AND ISNULL(v.FrontierName, '') = ISNULL(r.FrontierName, '')
AND ISNULL(v.SdnName, '') = ISNULL(r.SdnName, '')
)
_____________
Code for TallyGenerator
September 10, 2018 at 7:45 am
steve.gambale - Thursday, September 6, 2018 2:29 PM
Heh.... 50-60% CPU utilization in prod and 20% in QA. That should be a bell ringer right there. Prod is busier than QA, and even if those other processes in prod aren't hitting your table, your code has to wait its turn in the time share known as the number of cores.
The other thing is that if the QA box is at 20% average CPU, then something is really wrong with the code. QA should be dead quiet compared to prod because there are so few users in QA compared to prod. You need to start doing a deep dive into the code, find the top 10 offenders, and fix them. If there's still too much CPU usage, wash, rinse, and repeat for the next 10.
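A common starting point for that "top 10 offenders" hunt is sys.dm_exec_query_stats. This sketch ranks cached statements by total worker time (CPU); note it only sees what is still in the plan cache:

```sql
-- Top 10 cached statements by total CPU since they entered the cache.
SELECT TOP (10)
       qs.total_worker_time / 1000 AS total_cpu_ms,
       qs.execution_count,
       SUBSTRING(st.text,
                 (qs.statement_start_offset / 2) + 1,
                 ((CASE qs.statement_end_offset
                       WHEN -1 THEN DATALENGTH(st.text)
                       ELSE qs.statement_end_offset
                   END - qs.statement_start_offset) / 2) + 1) AS statement_text
FROM   sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_worker_time DESC;
```

Run it on both boxes; if the same statements top both lists, the difference really is load, not plan shape.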
Performance is in the code... or not. 😀
And, yes... there's a way to get rid of "Eager Spools". It's going to take better code.
--Jeff Moden
Change is inevitable... Change for the better is not.
September 10, 2018 at 1:55 pm
Thanks for the replies.
These are consolidated SQL Servers in QA and prod, with more than 80 databases on them and one SQL Server instance on each server.
That's probably why QA is not less than 20% utilized.