SQL apparently not realising it should ditch a cached execution plan and make a new one

  • I'm working on a reasonably complex database (i.e. many tables, SPs, UDFs, triggers and views). And being good programmers, we have written everything to work in a nice set-based way.

    Unfortunately, we are now running into occasionally severe performance problems which appear to be the result of SQL using existing execution plans that are no longer appropriate for the task in hand. For example:

    Restore database

    Calculate 'stuff' for everybody (about 36k people).

    Completes in 24 minutes.

    Restore database

    Calculate 'stuff' for 2 people, one after the other, individually.

    Takes a minute or two.

    Calculate 'stuff' for everybody (about 36k people).

    Still running after 2½ hours.

    Restore database

    Calculate 'stuff' for 2 people, one after the other, individually.

    Takes a minute or two.

    Clear out the execution plan cache.

    Calculate 'stuff' for everybody (about 36k people).

    Completes in 24 minutes.

    We could do the following:

    1) Change all the SPs to be "WITH RECOMPILE" but this just feels wrong. And the "calculate" step involves dozens and dozens of SPs.

    2) Have two (or more?) versions of all the SPs; a "small number of people" and a "large number of people" version. Each would require a third SP, which would decide which of the two to call depending on how many people were having their stuff calculated. Nasty!

    3) Some sort of ad hoc clearing of the execution plan cache when we know we're about to calculate 'stuff' for everybody (a rough sketch of this and option 1 is below).

    All of these seem a bit bonkers.
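
    For concreteness, a rough sketch of what 1 and 3 would look like (the procedure name here is made up, not one of our real SPs):

    -- Option 1: ask for a fresh plan on every execution of a given SP.
    -- dbo.Calculate_Stuff is a placeholder standing in for one of the real procedures.
    Create Procedure dbo.Calculate_Stuff
        @PersonID Int = Null    -- Null = everybody
    With Recompile
    As
    Begin
        Set NoCount On
        -- (the existing body would go here unchanged)
        Select @PersonID As PersonID
    End
    Go

    -- Option 3: throw the cached plans away just before the "everybody" run.
    -- Note this clears the plan cache for the whole instance, so everything recompiles afterwards.
    DBCC FREEPROCCACHE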

    Is there some other option? Or are we doing something fundamentally wrong?

    Thanks in advance.

  • Hard to say without seeing the stored procedures, but my first thought is that you have some bad parameter sniffing. There are many posts elsewhere about it if you want an in-depth idea of what happens, but the basics are that the query optimizer looks at the value of the parameter the first time you run the query and builds a plan based on that parameter value. The plan may be optimal for that input, but really awful for other parameter values.

    One thing you can do is tell the query to optimize for a particular value if you know what works well. There are other techniques for solving a bad parameter sniffing issue, and the articles out there will cover those too.
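
    For example, something along these lines inside the procedure (the table, procedure and parameter names here are made up, and this assumes the "everybody" plan is the one you want to keep):

    -- Hypothetical table and procedure, just to show the hint in context.
    Create Table dbo.Charges (PersonID Int, Amount Money)
    Go

    Create Procedure dbo.Example_GetCharges
        @PersonID Int = Null    -- Null = calculate for everybody
    As
    Begin
        Set NoCount On

        Select PersonID, Amount
        From dbo.Charges
        Where (@PersonID Is Null Or PersonID = @PersonID)
        -- Build the plan as though the parameter were Null (the "everybody" case),
        -- no matter what value happens to arrive on the first call.
        -- OPTIMIZE FOR UNKNOWN (2008+) is the "use average statistics" variant.
        Option (Optimize For (@PersonID = Null))
    End
    Go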

  • Smells like a parameter sniffing problem.

    No, WITH RECOMPILE is not the solution. It may 'fix' things, but it's a sledgehammer and it'll send your CPU usage way, way up.

    Can't say much more without more detail, but look on Google for Grant Fritchey Parameter Sniffing and you'll find some good articles and presentations.

    For reference, what causes a recompile (a couple of these are illustrated after the list):

    Table structure changes

    Procedure changes

    Stats change (default stats only change after 20% of the data in the table has changed, assuming auto_update is on)

    Explicit sp_recompile

    Plan cache flushed (restore, some sp_configure options, some ALTER Database)
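
    For instance (the procedure name is a placeholder):

    -- Flag a specific procedure so its plan is recompiled on the next execution.
    Exec sp_recompile 'dbo.Calculate_Stuff'

    -- Check whether automatic statistics updates are enabled for the current database.
    Select name, is_auto_update_stats_on
    From sys.databases
    Where name = Db_Name()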

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Hi Gail,

    Maybe I have it wrong (most likely), but I wondered whether it might help in this situation if the gentleman configured the server for ad-hoc queries?

    Regards,

    D.

  • Duran (2/18/2015)


    Maybe I have it wrong (most likely), but I wondered whether it might help in this situation if the gentleman configured the server for ad-hoc queries?

    Since they're stored procedures, not ad-hoc queries, probably not.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • DOH!!!

  • On the parameter sniffing suggestion, I've had a closer look at the timings of the disastrously slow call and the one section which really ballooned (from 31 seconds to 108 minutes) was this:

    Exec dbo.TEST_Timing 'Charging_Calculate_Accounts_Sub b', 1

    If (@Mode In (6, 7))
    Delete CCCIDFCFCE
    From dbo.TTemp_ChargingCalculationClientIDsForCFCE CCCIDFCFCE With (Index (IX_TTemp_ChargingCalculationClientIDsForCFCE_SPIDClientID))
    Where (SPID = @@SPID)

    Delete AM
    From dbo.TTemp_ChargingCalculationAllMatches AM With (Index (IX_TTemp_ChargingCalculationAllMatches_SPIDChargingCalculationClientFinanceID))
    Where (SPID = @@SPID)

    Delete CCCFCE
    From dbo.TTemp_ChargingCalculationCFCE CCCFCE With (Index (IX_TTemp_ChargingCalculationCFCE_SPIDClientID))
    Where (SPID = @@SPID)

    Delete CCACP
    From dbo.TTemp_ChargingCalculationActiveClientProvision CCACP With (Index (IX_TTemp_ChargingCalculationActiveClientProvision_SPIDClientID))
    Where (CCACP.SPID = @@SPID)

    Delete CCROR
    From dbo.TTemp_ChargingCalculationResidentialOrRespite CCROR With (Index (PK_TTemp_ChargingCalculationResidentialOrRespite))
    Where (CCROR.SPID = @@SPID)

    Delete CCBA
    From dbo.TTemp_ChargingCalculationBackdatedAdjustment CCBA With (Index (IX_TTemp_ChargingCalculationBackdatedAdjustment_SPID))
    Where (CCBA.SPID = @@SPID)

    -- If we're doing Detailed Charges, we need to keep CF and CP.
    If (@Mode <> 6)
    Begin
        Delete CCCF
        From dbo.TTemp_ChargingCalculationClientFinance CCCF With (Index (IX_TTemp_ChargingCalculationClientFinance_SPIDIsDeferredPaymentsFinancialAssessmentTypeID))
        Where (SPID = @@SPID)

        Delete CCCP
        From dbo.TTemp_ChargingCalculationClientProvision CCCP With (Index (IX_TTemp_ChargingCalculationClientProvision_SPIDChargingCategoryClassIDServiceFinancialAssessmentTypeIDTimetabledNonHolidayWeekl))
        Where (SPID = @@SPID)
    End

    Exec dbo.TEST_Timing 'Charging_Calculate_Accounts_Sub b', 0

    So perhaps it's an initial call with @Mode = 1 (say) which was OK, followed by one with @Mode = 7, which was not. If that's the case, we presumably just need to move these sections of qualified code out into their own, separate SPs, which will have their own individual execution plans.
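
    Roughly this sort of shape, I mean (the procedure names below are placeholders, not the real ones):

    -- The @Mode-qualified sections move into child SPs, each of which gets its own cached plan.
    Create Procedure dbo.Charging_Calculate_Accounts_Sub_Cleanup
        @Mode Int
    As
    Begin
        Set NoCount On

        If (@Mode In (6, 7))
            Exec dbo.Charging_Cleanup_ForCFCE                      -- the deletes qualified by @Mode In (6, 7)

        If (@Mode <> 6)
            Exec dbo.Charging_Cleanup_ClientFinanceAndProvision    -- the deletes qualified by @Mode <> 6
    End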

    Does that seem like a sensible approach?

  • Please try updating statistics. I had to do this frequently at one time for performance.
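
    For example (sp_updatestats covers the whole database; the targeted version below uses one of the tables from the earlier post, just as an illustration):

    -- Refresh out-of-date statistics for every table in the current database.
    Exec sp_updatestats

    -- Or rebuild statistics for a single table with a full scan.
    Update Statistics dbo.TTemp_ChargingCalculationAllMatches With FullScan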

  • Running the command

    DBCC FREEPROCCACHE

    and witnessing the expected performance makes your case for parameter sniffing.


  • Glenn Berry is where I go first for performance diagnosis.

    He has a query that I am putting here.

    -- Find single-use, ad-hoc and prepared queries that are bloating the plan cache (Query 46) (Ad hoc Queries)
    SELECT TOP(50) [text] AS [QueryText], cp.cacheobjtype, cp.objtype, cp.size_in_bytes/1024 AS [Plan Size in KB]
    FROM sys.dm_exec_cached_plans AS cp WITH (NOLOCK)
    CROSS APPLY sys.dm_exec_sql_text(plan_handle)
    WHERE cp.cacheobjtype = N'Compiled Plan'
    AND cp.objtype IN (N'Adhoc', N'Prepared')
    AND cp.usecounts = 1
    ORDER BY cp.size_in_bytes DESC OPTION (RECOMPILE);

    -- Gives you the text, type and size of single-use ad-hoc and prepared queries that waste space in the plan cache
    -- Enabling 'optimize for ad hoc workloads' for the instance can help (SQL Server 2008 and above only)
    -- Running DBCC FREESYSTEMCACHE ('SQL Plans') periodically may be required to better control this.
    -- Enabling forced parameterization for the database can help, but test first!

    If I see the SQL plan cache getting large, I schedule a job that runs periodically to free it, based on the results of this query. This is the ad hoc cache that never gets reused anyway, so clearing it frees up memory.
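
    The scheduled step itself can be as simple as the following (and the ad hoc setting mentioned in the query's comments is a one-time sp_configure change; test both on your system first):

    -- Free only the ad-hoc/prepared plan cache, leaving stored procedure plans in place.
    DBCC FREESYSTEMCACHE ('SQL Plans');

    -- Instance-wide setting so single-use ad-hoc plans are stubbed until they are reused (SQL Server 2008+).
    EXEC sp_configure 'show advanced options', 1;
    RECONFIGURE;
    EXEC sp_configure 'optimize for ad hoc workloads', 1;
    RECONFIGURE;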

    https://sqlserverperformance.wordpress.com/2014/09/17/sql-server-diagnostic-information-queries-for-september-2014/

    Glenn also walks you through a lot of these queries in his pluralsight courses.

    http://www.pluralsight.com/author/glenn-berry

    Also, you should keep a baseline of performance data and trend it over time to see what changes as your stored procedures and indexing strategy change.

    Erin Stellato has a great course on Pluralsight that will help you start baselining.

    http://www.pluralsight.com/courses/sqlserver-benchmarking

    She has several articles here that I have used as well.

    http://www.sqlservercentral.com/Authors/Articles/Erin_Stellato/351331/

  • julian.fletcher (2/19/2015)


    So perhaps it's an initial call with @Mode = 1 (say) which was OK, followed by one with @Mode = 7, which was not. If that's the case, we presumably just need to move these sections of qualified code out into their own, separate SPs, which will have their own individual execution plans.

    Does that seem like a sensible approach?

    Yes, absolutely. You could be running into this: http://sqlinthewild.co.za/index.php/2009/09/15/multiple-execution-paths/

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • One more place to check is Brad McGehee's website. I forgot about him earlier this morning. :w00t:

    http://www.sql-server-performance.com
