Let the optimizer do its thing -- wrong

  • I've heard this advice everywhere: don't do things to change the way the SQL optimizer produces plans. "It's smart, so don't use plan guides, query hints, forced recompiles, etc., etc."

    It reminds me of how the cost threshold for parallelism setting, out of the box, is years out of date due to hardware improvements. Today, many shops are using object-relational mapper (ORM) technology to produce SQL statements.

    Is this good? -- often it's horrible. Are we going back to stored procedures? -- not likely.

    Developers are using ORM technology and for many reasons we're stuck with it -- such as:

    1) it would cost a fortune to find and hire developers who could learn today's complicated business rules, C#, and .NET AND also be SQL gurus;

    2) today's DBAs are often not involved in the extensive meetings to hash out the business logic for today's applications, so we couldn't necessarily provide stored procedures anyway.

    With the combination of ORM-generated SQL and, in some cases like ours, significant skew in the data (cardinality), SQL often times out on query compilation and chooses a bad plan (e.g. a plan that worked for a tiny client with 10K records but is not good for another client with data in the same large tables who has 50 million records).

    The articles telling you not to use tweaks often show a sample query like

    select name,price from orders where name='Smith'

    I saw something like that when researching plan guides, then balked when considering placing something like the query below into a plan guide.

    So I'm looking at filtered statistics, hints to force index use, and what seems to be the magic wand for a subset of problem queries -- OPTION (RECOMPILE). Of course, hints that are easy to add ourselves aren't quite as easy to add to specific ORM-generated queries.
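    For ORM-generated statements we can't touch at the source, a plan guide is one way to bolt a hint on after the fact. Here's a minimal sketch using the toy query from those articles (the guide name is made up, and with a SQL-type guide the @stmt text has to match what the application actually submits character for character -- which is exactly why wrapping the monster query below in a guide is so daunting):

    EXEC sp_create_plan_guide
        @name = N'guide_force_recompile',   -- hypothetical guide name
        @stmt = N'select name,price from orders where name=''Smith''',  -- must match the submitted statement text exactly
        @type = N'SQL',                     -- stand-alone statement, not a module
        @module_or_batch = NULL,            -- NULL means @stmt is the whole batch
        @params = NULL,                     -- no parameter list for this unparameterized example
        @hints = N'OPTION (RECOMPILE)';     -- the hint the ORM won't let us add directly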

    Now I'm beginning to understand why we've needed to run DBCC FREEPROCCACHE weekly. The optimizer is only going to spend so much time on compilation -- as Kimberly Tripp points out in this video, SQL looks for "a good plan, fast" -- NOT A PERFECT PLAN. She discusses filtered statistics, but in my testing, new optimizer-suggested high-value indexes and filtered statistics don't necessarily even get used if the query is complex and SQL is left to its own devices.

    https://youtu.be/li5HwaZF8tc?list=PLoGAcXKPcRvbTr23ujEN953pLP_nDyZJC
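    For what it's worth, a filtered statistic scoped to one large client looks something like this -- only a sketch, with the table, columns and client id borrowed from the query below and the dbo schema assumed:

    -- histogram limited to one big client's rows, so the small clients
    -- sharing the table don't dominate the estimates
    CREATE STATISTICS stat_ACCOUNT_client_11330
    ON dbo.ACCOUNT (CLIENT_ID, CUSTOM_ATTRIBUTE_4)
    WHERE CLIENT_ID = 11330
    WITH FULLSCAN;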

    SELECT DISTINCT TOP 2000
    CollateralGroupRequest02.SERVICE_REQUEST_ID AS PrimaryKey,
    Client18.SHORT_NAME AS ClientShortName,
    EeeReason19.DESCRIPTION AS EeeReason,
    AccountProperty110.MODIFIED_MANUFACTURER_ID AS VIN,
    Account15.CATEGORY AS AccountType,
    FollowupEntity111.ENTITY_CODE AS DealerID,
    FollowupEntity111.NORMALIZED_ENTITY_CODE AS FollowupEntity111_NORMALIZED_ENTITY_CODE58,
    Account15.FINANCED_DATE AS FinancedDate,
    Account15.BOOKED_DATE AS BookedDate,
    AccountProperty110.EXPECTED_TITLING_STATE_ABBR AS ExpectedTitlingState,
    BusinessUnit112.LONG_NAME AS BusinessUnit,
    BusinessUnit112.SHORT_NAME AS BusinessUnitId,
    CollateralGroupRequest02.SERVICE_REQUEST_STATUS AS Status,
    AccountOwnershipDocSummary16.STATUS AS AccountStatus,
    AccountOwnershipDocSummary16.BORROWER_FULL_NAMES AS BorrowerFullNames,
    Account15.CUSTOM_ATTRIBUTE_1 AS AccountNumber,
    Account15.CUSTOM_ATTRIBUTE_2 AS LoanNumber,
    Account15.CUSTOM_ATTRIBUTE_3 AS LoanSuffix,
    Account15.CUSTOM_ATTRIBUTE_4 AS Branch,
    CollateralGroupRequest02.EEE_DATE AS EEEDate,
    CollateralGroupRequest02.EEE_CAUSE AS CauseOfEEE,
    CollateralGroupRequest02.REQUEST_TRANSACTION_TYPE AS RequestType,
    Jurisdiction113.SHORT_NAME AS TMState,
    Account15.RECOVERY_STATUS AS RecoveryCode,
    Account15.USER_DEFINED_1 AS UserDef1,
    Account15.USER_DEFINED_2 AS UserDef2,
    Account15.USER_DEFINED_3 AS UserDef3,
    LienholderStatusCode114.STATUS_CODE AS LienholderStatus,
    AccountProperty110.VEHICLE_TYPE AS CollateralType,
    Account15.SUB_CATEGORY AS AccountSubType,
    (SELECT
    DerivedTable01_11
    FROM (SELECT
    MIN(ReminderWorkItem02.REMIND_DATE) AS DerivedTable01_11
    FROM WORK_QUEUE_ITEM AS ReminderWorkItem02
    WHERE ((ReminderWorkItem02.ACCOUNT_ID = Account15.ACCOUNT_ID
    AND ReminderWorkItem02.BUSINESS_PROCESS_STATUS = 'Open'
    AND ReminderWorkItem02.NAME = 'REMINDER'
    AND ReminderWorkItem02.SECURED_ORGANIZATIONORGANIZATION_ID = Account15.CLIENT_ID)
    AND ((ReminderWorkItem02.CONCRETE_TYPE IN ('Fdi.Workflow.Po.ReminderWorkItem'))))) AS ScalarQueryTable)
    AS ReminderDate,
    (SELECT
    ExternalUser13_USERNAME13
    FROM (SELECT TOP 1
    ExternalUser13.USERNAME AS ExternalUser13_USERNAME13,
    ReminderWorkItem02.REMIND_DATE AS ReminderWorkItem02_REMIND_DATE1
    FROM WORK_QUEUE_ITEM AS ReminderWorkItem02
    INNER JOIN USR AS ExternalUser13
    ON ReminderWorkItem02.ASSIGNED_USER_ID = ExternalUser13.USR_ID
    WHERE ((ReminderWorkItem02.ACCOUNT_ID = Account15.ACCOUNT_ID
    AND ReminderWorkItem02.BUSINESS_PROCESS_STATUS = 'Open'
    AND ReminderWorkItem02.NAME = 'REMINDER'
    AND ReminderWorkItem02.SECURED_ORGANIZATIONORGANIZATION_ID = Account15.CLIENT_ID)
    AND ((ReminderWorkItem02.CONCRETE_TYPE IN ('Fdi.Workflow.Po.ReminderWorkItem'))))
    ORDER BY 2) AS ScalarQueryTable)
    AS ReminderUser
    FROM SERVICE_REQUEST AS CollateralGroupRequest02
    INNER JOIN (SERVICED_COLLATERAL_GROUP_ITEM AS ServicedCollateralGroupItem13
    INNER JOIN (SERVICED_COLLATERAL_GROUP_ITEM AS ServicedAccount14
    INNER JOIN (ACCOUNT AS Account15
    INNER JOIN ACCOUNT_OWNERSHIP_DOC_SUMMARY AS AccountOwnershipDocSummary16
    ON Account15.ACCOUNT_ID = AccountOwnershipDocSummary16.ACCOUNT_ID
    INNER JOIN PROPERTY AS AccountProperty110
    ON Account15.ACCOUNT_ID = AccountProperty110.ACCOUNT_ID
    LEFT OUTER JOIN LEGAL_ENTITY AS FollowupEntity111
    ON Account15.FOLLOWUP_ENTITYLEGAL_ENTITY_ID = FollowupEntity111.LEGAL_ENTITY_ID
    LEFT OUTER JOIN BUSINESS_UNIT AS BusinessUnit112
    ON Account15.BUSINESS_UNIT_ID = BusinessUnit112.BUSINESS_UNIT_ID
    LEFT OUTER JOIN LIENHOLDER_STATUS_CODE AS LienholderStatusCode114
    ON Account15.LIENHOLDER_STATUS_CODE_ID = LienholderStatusCode114.LIENHOLDER_STATUS_CODE_ID)
    ON ServicedAccount14.ACCOUNT_ID = Account15.ACCOUNT_ID)
    ON ServicedCollateralGroupItem13.SERVICED_COLLATERAL_GROUP_ITEM_ID = ServicedAccount14.SERVICED_COLLATERAL_GROUP_ITEM_ID)
    ON CollateralGroupRequest02.SERVICE_REQUEST_ID = ServicedCollateralGroupItem13.COLLATERAL_GROUP_REQUESTSERVICE_REQUEST_ID
    INNER JOIN ORGANIZATION AS Client18
    ON CollateralGroupRequest02.CLIENT_ID = Client18.ORGANIZATION_ID
    LEFT OUTER JOIN EEE_REASON AS EeeReason19
    ON CollateralGroupRequest02.EEE_REASON_ID = EeeReason19.EEE_REASON_ID
    INNER JOIN ORGANIZATION AS Jurisdiction113
    ON CollateralGroupRequest02.JURISDICTION_ID = Jurisdiction113.ORGANIZATION_ID
    WHERE ((CollateralGroupRequest02.CLIENT_ID = 11330
    AND Account15.CLIENT_ID = 11330
    AND CollateralGroupRequest02.EEE_DATE IS NOT NULL
    AND ((CollateralGroupRequest02.CONCRETE_TYPE = 'Fdi.Po.FollowUpRequest'
    AND CollateralGroupRequest02.BUSINESS_PROCESS_STATUS IN ('Open', 'Closed')
    AND AccountOwnershipDocSummary16.STATUS = 'NO_TITLE_PM')
    OR (CollateralGroupRequest02.CONCRETE_TYPE = 'Fdi.Po.TitleMaintenanceRequest'
    AND CollateralGroupRequest02.BUSINESS_PROCESS_STATUS = 'Open'
    AND AccountOwnershipDocSummary16.STATUS = 'TITLE_MAINTENANCE_REQUEST_SENT')
    OR (CollateralGroupRequest02.CONCRETE_TYPE = 'Fdi.Po.DuplicateTitleRequest'
    AND CollateralGroupRequest02.BUSINESS_PROCESS_STATUS = 'Closed')
    OR (CollateralGroupRequest02.CONCRETE_TYPE = 'Fdi.Po.DirectLendingServiceRequest'
    AND CollateralGroupRequest02.BUSINESS_PROCESS_STATUS = 'Open'
    AND AccountOwnershipDocSummary16.STATUS = 'NO_TITLE_PM'
    OR (CollateralGroupRequest02.SERVICE_REQUEST_STATUS = 'DocumentsReturned'
    AND CollateralGroupRequest02.BUSINESS_PROCESS_STATUS = 'Closed'
    AND AccountOwnershipDocSummary16.STATUS = 'NO_TITLE_PM'))
    OR (CollateralGroupRequest02.CONCRETE_TYPE = 'Fdi.Po.AdHocRequest'
    AND CollateralGroupRequest02.BUSINESS_PROCESS_STATUS = 'Open'))
    AND (Account15.CUSTOM_ATTRIBUTE_4 = '01102')))
    ORDER BY 20
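    On the FREEPROCCACHE point above: rather than flushing the whole cache weekly, a single offending plan can be evicted once its handle is known. A sketch (the handle shown is a placeholder, not a real value):

    -- find the plan handle for one problem statement
    SELECT cp.plan_handle, st.text
    FROM sys.dm_exec_cached_plans AS cp
    CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
    WHERE st.text LIKE N'%SERVICE_REQUEST%';

    -- evict just that plan instead of the entire cache
    DBCC FREEPROCCACHE (0x060006001ECA270EC0215D05000000000000000000000000);  -- placeholder plan_handle; substitute the real one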

  • Indianrock (12/9/2016)



    Not really surprising that you see/hear that "advice" often. Those same C# .NET guys that don't know beans about databases are usually the cause of the problem, because their lack of knowledge about databases sends them online to search for solutions, and sometimes the first thing they find is some kind of query hint, that while it works at the time they implement it, turns out in the long run to not be such a good idea. And thus the reason for the existence of all that advice.

    Query hints and such should be more in the category of "last resort" options. If you have to go there, your design is probably at least partially at fault. Of course, we all know that we are often stuck with a design that is truly awful, and have to find ways to cope because corporate resistance to change is about as far from futile as Captain Picard's ultimate resistance to the Borg was in the TV series Star Trek: The Next Generation.

    Anyway, the "advice" has always been general guidance. It can't possibly apply to every situation. The fact that you hear/see it often is actually a good thing, so as to discourage the usage of such things, with the hope that it will spur innovation and creative thinking and ultimately better designs will make their way into practice.

    That said, and having read your post a couple of times to try and be sure I wasn't missing anything, did you have a particular question, or was this more of a "get it off my chest" thing?

    Steve (aka sgmunson) πŸ™‚ πŸ™‚ πŸ™‚
    Rent Servers for Income (picks and shovels strategy)

  • "This query was written by a tool". ORM tools can only go so far and this query goes way beyond that. It's unwieldy, inefficient and difficult to engineer into a form amenable to hand crafting. If you were handed this query to tune, you'd have to rewrite it first.

    β€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • The main question would be: am I nuts for thinking this way? And, since for the time being we are stuck with our design and ORM approach, has anyone "tweaked" ORM code to add such hints for specific queries?

    We have something in place which "differentiates" between large and small clients, but it's hard-coded, only kicks in for a handful of the largest clients, AND I think all it does is insert a SQL comment into specific queries to try and force a new plan.

    The problem is, if SQL is often doing this (StatementOptmEarlyAbortReason="TimeOut" RetrievedFromCache="true") then the "comment" may do nothing. It seems only OPTION (RECOMPILE) forces SQL to take however long is needed to generate an optimal plan for this query with these parameters.

    I realize you don't want a large percentage of queries recompiling, but since we have these "reporting" type queries on the same system doing OLTP, something better than daily FREEPROCCACHE needs to be done. We're rushing to try and move all of the databases to enterprise solid-state drives to give the developers more time to fix their queries.
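    A way to gauge how widespread the timeouts are, without flushing anything, is to pull the plans whose compilation timed out straight from the cache -- a sketch, assuming the standard showplan XML namespace:

    -- cached plans the optimizer gave up on (StatementOptmEarlyAbortReason="TimeOut")
    WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
    SELECT TOP (25)
           qs.execution_count,
           qs.total_worker_time / qs.execution_count AS avg_cpu_us,
           SUBSTRING(st.text, 1, 200) AS query_start,
           qp.query_plan
    FROM sys.dm_exec_query_stats AS qs
    CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
    CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
    WHERE qp.query_plan.exist('//StmtSimple[@StatementOptmEarlyAbortReason="TimeOut"]') = 1
    ORDER BY avg_cpu_us DESC;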

  • Indianrock (12/9/2016)



    No, you're not nuts for thinking that way. But consider the possibility that at some point, management SHOULD get tired of spending additional dollars to "re-solve" a problem that should have already been solved, but wasn't. The question is, who's going to have the requisite intestinal fortitude to tell them the truth that not spending enough initially is the PRIMARY CAUSE of that problem, because they perceive that spending big bucks on DB talent is "too expensive". Obviously, when they spend more up front on good talent, they are likely to get a much better, more long lasting, and ultimately, less expensive solution. Unfortunately, far too many folks have lost pretty much ALL of their intestinal fortitude.

    Steve (aka sgmunson) πŸ™‚ πŸ™‚ πŸ™‚
    Rent Servers for Income (picks and shovels strategy)

  • The MANY (although not nearly as many as Oracle has) knobs, OPTIONS, HINTS, trace flags, etc. that SQL Server exposes are there for a reason: the optimizer cannot POSSIBLY get things right all the time. It truly is EXCEPTIONALLY good at its job - finding a good enough plan quickly enough (NOT, as many think, finding the BEST plan). To those who know what they are doing and/or know their data and data access patterns, those overrides can make you look like you walk on water.

    One thing I will explicitly point out is the "magic wand" as you called it: OPTION (RECOMPILE). But first, another Guruism for the crowd:

    I will jump through EXTRAORDINARY hoops to trade CPU ticks for pretty much anything.

    The reason I say that, and the reason OPTION (RECOMPILE) is so good, is that CPUs do BILLIONS OF THINGS PER SECOND PER CORE. In our computers, the next level down from that on the performance scale is ~1000 times slower. So trade those CPU ticks to give the optimizer EXACTLY WHAT IT NEEDS to get a good enough plan quickly enough, and avoid the disastrously bad plans caused by data-value skew or widely varying input parameters.
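    To make that concrete for anyone following along: with OPTION (RECOMPILE) the statement is compiled for the actual runtime values each time, so the tiny client and the 50-million-row client each get their own plan. A trivial sketch (the parameter name is made up; table and columns are taken from the query above):

    DECLARE @ClientId int = 11330;   -- hypothetical parameter

    -- the value of @ClientId is sniffed at compile time and this plan is not reused
    SELECT sr.SERVICE_REQUEST_ID, sr.SERVICE_REQUEST_STATUS
    FROM SERVICE_REQUEST AS sr
    WHERE sr.CLIENT_ID = @ClientId
    OPTION (RECOMPILE);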

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Yes, yes and yes. And correct me if I'm wrong, but once the front end of a plan (the right side as viewed in an execution plan) has huge variances in estimated/actual rows, that mass has to funnel on down to the left. Not expressing that very well, but just getting at the idea that if SQL, left to its own devices, picks a bad plan, the skew/estimate error cascades down through more than just the first operation/node.

    See the attached plan, which is related to the query posted above -- the attached plan is what SQL did with no hints or options. I need to do more testing/timings to see if the filtered stats are needed or if OPTION (RECOMPILE) on some queries will do the trick for the moment.

    I've added the high-value missing indexes, as revealed by Brent Ozar's sp_BlitzIndex, but once I got rid of the lowest-hanging fruit, I started focusing on unused indexes.

    Development makes removing unused indexes a big, slow pain. They want to spend weeks or more seeing if "the code uses the index" despite DMV results showing some have zero reads after SQL is up for over a month.
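    For the unused-index argument, the DMV evidence looks something like the query below -- a sketch only; keep in mind sys.dm_db_index_usage_stats resets on restart, which is why the "up for over a month" qualifier matters:

    -- nonclustered indexes with no reads since the last restart, but still paying a write cost
    SELECT OBJECT_NAME(i.object_id) AS table_name,
           i.name AS index_name,
           us.user_seeks, us.user_scans, us.user_lookups, us.user_updates
    FROM sys.indexes AS i
    LEFT JOIN sys.dm_db_index_usage_stats AS us
           ON us.object_id = i.object_id
          AND us.index_id = i.index_id
          AND us.database_id = DB_ID()
    WHERE i.type_desc = 'NONCLUSTERED'
      AND OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1
      AND ISNULL(us.user_seeks, 0) + ISNULL(us.user_scans, 0) + ISNULL(us.user_lookups, 0) = 0
    ORDER BY us.user_updates DESC;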

  • Forgot to mention: when you have huge variances in estimated/actual rows (e.g. 600 estimated, 1.4 million actual), how likely is it that some of the work dumps out of memory into tempdb?

  • Comment from a co-worker DBA. We probably shouldn't have turned on forced parameterization years ago, since all application SQL is parameterized and only prod-support ad hoc work isn't, but I'm not sure about his conclusion here:

    "It’s not using new stuff because of FORCED parameterization because it caches the plan and reuses it. If you keep it SIMPLE, I think it would use the new stats/indexes."

  • Indianrock (12/9/2016)


    Forgot to mention: when you have huge variances in estimated/actual rows (e.g. 600 estimated, 1.4 million actual), how likely is it that some of the work dumps out of memory into tempdb?

    Spooling various things to tempdb is just one of MANY things that can be suboptimal when you have a large disparity (in EITHER DIRECTION) between Estimated and Actual rows. And there are a metric butt-ton of things that can get you those disparities too!!
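    If it helps, those spills can be caught as they happen -- the sort and hash warnings are exposed as Extended Events. A minimal sketch (the session name and target file name are made up):

    -- capture sort and hash spill warnings to a file target
    CREATE EVENT SESSION [spill_watch] ON SERVER
    ADD EVENT sqlserver.sort_warning,
    ADD EVENT sqlserver.hash_warning
    ADD TARGET package0.event_file (SET filename = N'spill_watch.xel');

    ALTER EVENT SESSION [spill_watch] ON SERVER STATE = START;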

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Indianrock (12/9/2016)



    Or, worse yet, there's a smart-cookie in the group that specifically tries to write queries to generate usage for that specific index in order to try and ensure you don't get rid of it... Been there, seen it, figured out what was going on, and then proved that such was what the individual was doing, and put a stop to it. Fortunately, I had friends in high places at the time.

    Steve (aka sgmunson) πŸ™‚ πŸ™‚ πŸ™‚
    Rent Servers for Income (picks and shovels strategy)

  • I'm now playing with plan guides and turning off forced parameterization in QA. So far it isn't using the plan guide -- that's a tough query to get exactly right in the guide. Run this way, it does use the new index but not the filtered stats, and it is fast.
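    A couple of checks that can show whether a guide is even in play (a sketch): list the guides, then look for cached plans carrying the PlanGuideName attribute, which only shows up when a guide actually matched.

    -- guides defined in the current database
    SELECT name, scope_type_desc, is_disabled
    FROM sys.plan_guides;

    -- cached plans that were actually compiled under a plan guide
    WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
    SELECT st.text, qp.query_plan
    FROM sys.dm_exec_cached_plans AS cp
    CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp
    CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
    WHERE qp.query_plan.exist('//StmtSimple[@PlanGuideName]') = 1;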

  • Indianrock (12/9/2016)


    Comment from a co-worker DBA. We probably shouldn't have turned on forced parameterization years ago, since all application SQL is parameterized and only prod-support ad hoc work isn't, but I'm not sure about his conclusion here:

    "It’s not using new stuff because of FORCED parameterization because it caches the plan and reuses it. If you keep it SIMPLE, I think it would use the new stats/indexes."

    Not correct literally, although it could result in suboptimal stuff.

    I note that forced parameterization should still almost certainly be off. I have done SQL Server consulting since way before that option was available, and I have seen exactly one case since it came out where it was a win. That was a BRUTALLY ugly classic ADO looping-recordset application.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Indianrock (12/9/2016)


    I've heard this advice everywhere: don't do things to change the way the SQL optimizer produces plans. "It's smart, so don't use plan guides, query hints, forced recompiles, etc., etc."

    If you don't know what you are doing and don't understand the SQL Server query optimizer then this advice is pretty good. I would say, for example, that 90% of the WITH (NOLOCK) table hints I have seen are unnecessary, unhelpful and counter-intuitive.

    Query hints, the ability to force a recompile, etc. exist for a reason. There are times when your careful, informed and extensive testing leads you to conclude that overriding the optimizer's decisions is the way to go. Recently there have been times when I have needed to require a stored proc to recompile before each run, force a parallel execution plan (using Adam Machanic's make_parallel), or force a serial plan using OPTION (MAXDOP 1). Again, in each case for me, these changes occurred after extensive testing and with a full understanding of the risks.
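    For reference, the simplest forms of two of those overrides look like this (a sketch; the procedure and parameter names are placeholders):

    -- recompile a single execution of an existing procedure without changing its definition
    EXEC dbo.usp_SomeReport @ClientId = 11330 WITH RECOMPILE;

    -- keep one statement serial while leaving the server-wide MAXDOP setting alone
    SELECT COUNT(*)
    FROM dbo.SERVICE_REQUEST
    WHERE CLIENT_ID = 11330
    OPTION (MAXDOP 1);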

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001
