much worse execution plan calling stored procedure vs. query window

  • Hi everyone,

    I can post DDL and execution plans if desired, but perhaps this question can be answered on a theoretical level.

    Just want to say that I have inherited this code, and had no hand in designing any part of the following....

    I have a stored procedure that accepts a payrollID as a parameter. When I call the procedure with:

    EXEC dbo.GetPersonalTaxesAccumulators @PayrollID = 345

    it takes 1.1 seconds, which is way too long.

    If I break out the body of the procedure to a query window, and set up the correct parameter values, it takes 48 milliseconds.
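    To be concrete, the query-window test was set up roughly like this (the body itself is elided in this sketch):

        DECLARE @PayrollID int
        SET @PayrollID = 345

        -- ...then the body of the proc, unchanged, referencing @PayrollID...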

    However, if in the body of the proc, I do this:

    ALTER PROCEDURE [dbo].[GetPersonalTaxesAccumulators_v2]
        (@PayrollID1 int)
    AS
    BEGIN
        DECLARE @PayrollID int
        SET @PayrollID = @PayrollID1
        ......etc.....

    and call EXEC dbo.GetPersonalTaxesAccumulators_v2 @PayrollID1 = 345

    it executes in 48 milliseconds.

    Not sure why the optimizer is choosing a sub-optimal plan.

    Anyone run into something like this before?

    Thanks in advance --

    SQLNYC

    Sounds a lot like parameter sniffing; are you able to modify the procedure?

    I'd bet that adding OPTION (OPTIMIZE FOR (@PayrollID UNKNOWN)) to the query would build better plans:

    ALTER PROCEDURE [dbo].[GetPersonalTaxesAccumulators_v2]
        (@PayrollID1 int)
    AS
    BEGIN
        DECLARE @PayrollID int
        ...
        SELECT ...{stuff}
        ORDER BY ...
        OPTION (OPTIMIZE FOR (@PayrollID UNKNOWN))
    END
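    When testing the change, flush the stale plan first so the proc actually recompiles - for example:

        -- mark the proc so its next execution compiles a fresh plan
        EXEC sp_recompile N'dbo.GetPersonalTaxesAccumulators_v2';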

    Lowell


    --Help us help you! If you post a question, make sure you include a CREATE TABLE... statement and an INSERT INTO... statement to give the volunteers here representative data. With your description of the problem, we can provide a tested, verifiable solution to your question! Asking the question the right way gets you a tested answer the fastest way possible!

    This kind of question appears on SQL forums quite often...

    There are a few possible reasons for this to happen; parameter sniffing is one of them.

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help

  • Thanks for the replies.

    My understanding of parameter sniffing is that the optimizer reuses the plan that was compiled for the parameter values of the first call, which may not be optimal for subsequent calls.

    But the issue we are seeing is that if we drop and re-create the stored procedure, performance is terrible the very first time it's called and for all subsequent calls, all using the same value for the PayrollID.

    And so if it's not parameter sniffing, what else might be the cause of this? I have verified that stats are up to date.
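    For anyone else checking: the value a cached plan was compiled for can be read from the plan XML (each parameter's ParameterCompiledValue attribute records the sniffed value). A sketch, assuming SQL 2008 or later:

        SELECT qp.query_plan
        FROM sys.dm_exec_procedure_stats AS ps
        CROSS APPLY sys.dm_exec_query_plan(ps.plan_handle) AS qp
        WHERE ps.object_id = OBJECT_ID(N'dbo.GetPersonalTaxesAccumulators');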

    Thanks again --

    SQLNYC

    What is your default connection option in EM for ANSI_NULLS? I think it will be OFF; in the old QA it used to be ON...

    Try to create your proc as per following:

    SET QUOTED_IDENTIFIER OFF
    GO
    SET ANSI_NULLS ON
    GO
    CREATE PROCEDURE ...

    Does it make any difference?
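    You can also check which settings the proc was actually created with - a quick sketch:

        -- sys.sql_modules stores the SET options in effect at CREATE time
        SELECT m.uses_ansi_nulls, m.uses_quoted_identifier
        FROM sys.sql_modules AS m
        WHERE m.object_id = OBJECT_ID(N'dbo.GetPersonalTaxesAccumulators');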

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help

  • Hi Eugene,

    Thanks for your help.

    ANSI_NULLS was set to ON in my EM, as was QUOTED_IDENTIFIER. I set QUOTED_IDENTIFIER to OFF and then checked the proc. They were both ON there; I set them as you suggested, then dropped and re-created the proc.

    Unfortunately, it didn't make any difference.

    SQLNYC

  • Looking at execution plans is not foreign to me, but I'm no guru. Is there a way to determine why the optimizer chose a given path?

    Thanks --

    SQLNYC

  • sqlnyc (8/10/2012)


    Thanks for the replies.

    My understanding of parameter sniffing is that the optimizer reuses the plan that was compiled for the parameter values of the first call, which may not be optimal for subsequent calls.

    But the issue we are seeing is that if we drop and re-create the stored procedure, performance is terrible the very first time it's called and for all subsequent calls, all using the same value for the PayrollID.

    And so if it's not parameter sniffing, what else might be the cause of this? I have verified that stats are up to date.

    Thanks again --

    SQLNYC

    Hi!

    From what you describe, I conclude that the test value is no good for building a plan. Please verify that by adding OPTION (RECOMPILE) to the query in your proc.

    If that is so, then there is a question: why does the optimizer fail to build a good plan for a known value?

    There might be several reasons.

    The first thing to check is bad statistics. You say the stats are up to date - but were they built with the FULLSCAN option? To rule this reason out, I advise you to update statistics as sketched below.
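    Both checks, sketched (dbo.MyTable and MyColumn are placeholders for whichever table and column the slow query filters on):

        DECLARE @PayrollID int;
        SET @PayrollID = 345;

        -- 1) does compiling a fresh plan for the known value fix it?
        SELECT COUNT(*)
        FROM dbo.MyTable
        WHERE MyColumn = @PayrollID
        OPTION (RECOMPILE);

        -- 2) rebuild the statistics from every row rather than a sample
        UPDATE STATISTICS dbo.MyTable WITH FULLSCAN;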

    If this does not help, then we conclude that the optimizer misestimates the number of rows for that value, and the guess it makes for an unknown value happens to fit better.

    And again, there might be several reasons.

    To name the first: it may be an accumulating estimation error due to the optimizer's mathematical model - for example, a non-uniform real data distribution where the optimizer assumes uniformity. That can lead to an estimate of 70 rows vs. an actual 10,000 rows (in my last case). So when estimating for an unknown value the optimizer falls back to a guess - for example, about 30% for an inequality - and that guess may accidentally be close to what actually happens.
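    To see how far the model is from reality, compare the histogram with an actual count - a sketch, again with placeholder table and index names:

        -- histogram and density for the statistics object on the filtered column
        DBCC SHOW_STATISTICS ('dbo.MyTable', 'IX_MyTable_MyColumn');

        -- actual row count for the value in question
        SELECT COUNT(*) FROM dbo.MyTable WHERE MyColumn = 345;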

    The second reason might be an imperfection in the algorithm that builds the stats. Yes, it happens, and there is an interesting example of it on the Connect site.

    There might also be other reasons that I can't recall at once.

    But all the answers are in the actual execution plan. So, to determine the reason for sure, you should provide the actual plans (quick and slow) in XML and, if needed, some extra info about indexes and statistics.

    In any case, from what you described, Lowell's advice about OPTIMIZE FOR UNKNOWN should be helpful. But if you want to know the reasons, as I said, you should look for them in the actual plans.


    I am really sorry for my poor grammar. And I hope that the value of my answers will outweigh the harm to your eyes.
    Blog: http://somewheresomehow.ru
    Twitter: @SomewereSomehow

  • Could you post the plans please (as .sqlplan files)?

    It's almost certainly a form of parameter sniffing (the optimiser can sniff the parameter values in the procedure, but it cannot sniff the variable values in the query window), but I need to see the plan and probably ask additional questions to figure out exactly what's happening.
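    Roughly, the contrast looks like this (a sketch - dbo.Demo_Sniffed and its one-line query are made up purely for illustration):

        -- parameter: compiled inside the proc, the value 345 is sniffed and
        -- the plan is costed for that specific value
        CREATE PROCEDURE dbo.Demo_Sniffed (@PayrollID int)
        AS
            SELECT COUNT(*) FROM dbo.SomeTable WHERE SomeColumn = @PayrollID;
        GO

        EXEC dbo.Demo_Sniffed @PayrollID = 345;

        -- variable: in an ad-hoc batch the value is unknown at compile time,
        -- so the optimiser falls back to density ("unknown") estimates
        DECLARE @PayrollID int;
        SET @PayrollID = 345;
        SELECT COUNT(*) FROM dbo.SomeTable WHERE SomeColumn = @PayrollID;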

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Thanks SomewhereSomehow and Gail for your great effort. I have attached the execution plans.

    Yes, the OPTIMIZE FOR UNKNOWN works.

    SQLNYC

  • Just updated stats with FULLSCAN on the relevant tables. Unfortunately this did not make any improvement.

    Looks like SQL is seriously over-estimating the rows in several cases, leading to a far more expensive plan than necessary.

    Can you try the procedure with the query that starts WITH PersonalIds... given a MAXDOP 1 hint? The optimal plan is serial; the bad plan is parallel.

    Widen the index PayEntries.IX_PayEntries_PayBatchID and add EmployeeID as an include column.

    Widen the index Employees.IX_Employees_EmployeePersonalID and add EmployeeID and DivisionID as include columns

    Consider widening the index [CheckTaxes].[IX_CheckHeaderID], include TaxCodeId, GrossWages, SubjectWages, ExemptWages, TaxableWages, TaxAmount, UncollectedTaxes
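    Rough sketches of those three changes (assuming each index is keyed on the column its name suggests; DROP_EXISTING = ON makes each a single rebuild):

        CREATE INDEX IX_PayEntries_PayBatchID
            ON dbo.PayEntries (PayBatchID)
            INCLUDE (EmployeeID)
            WITH (DROP_EXISTING = ON);

        CREATE INDEX IX_Employees_EmployeePersonalID
            ON dbo.Employees (EmployeePersonalID)
            INCLUDE (EmployeeID, DivisionID)
            WITH (DROP_EXISTING = ON);

        CREATE INDEX IX_CheckHeaderID
            ON dbo.CheckTaxes (CheckHeaderID)
            INCLUDE (TaxCodeId, GrossWages, SubjectWages, ExemptWages,
                     TaxableWages, TaxAmount, UncollectedTaxes)
            WITH (DROP_EXISTING = ON);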

    Can you update stats on Employees, CheckTaxes and the tables those two join to with fullscan?

    It's this join that's the problem, along with the join to CheckTaxes on CheckHeaderID:

        INNER JOIN PersonalIds
            ON YTDChecks.EmployeePersonalID = PersonalIds.EmployeePersonalID

    Can you make sure that both sides of those joins are indexed, and unique where applicable?

    It may also be more optimal to insert the results of the function into a temp table, index that, and join to it. It's worth testing.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Hi Gail,

    Thanks so much for your help.

    I tried MAXDOP 1, and there was only a slight difference in execution time. Both were ~1500 ms.

    Here is the definition of PersonalIds - it is a CTE and therefore can't be indexed. In my earlier testing, when I placed the results of the SELECT statement that defines PersonalIds in a #temp table (roughly as sketched after the CTE below), performance was excellent. I did, however, notice the serious over-estimation of row counts even with a good plan.

    WITH PersonalIds (EmployeePersonalID) AS
    (
        SELECT Employees.EmployeePersonalID
        FROM Employees
        INNER JOIN PayEntries
            ON Employees.EmployeeID = PayEntries.EmployeeID
        INNER JOIN PayBatches
            ON PayEntries.PayBatchID = PayBatches.PayBatchID
        WHERE PayBatches.DivisionPayrollID = @PayrollID
        GROUP BY Employees.EmployeePersonalID
    )
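    For reference, the #temp table version I tested was along these lines (a sketch from memory; the index is an assumption about what I would add):

        SELECT Employees.EmployeePersonalID
        INTO #PersonalIds
        FROM Employees
        INNER JOIN PayEntries
            ON Employees.EmployeeID = PayEntries.EmployeeID
        INNER JOIN PayBatches
            ON PayEntries.PayBatchID = PayBatches.PayBatchID
        WHERE PayBatches.DivisionPayrollID = @PayrollID
        GROUP BY Employees.EmployeePersonalID;

        -- one row per EmployeePersonalID, so a unique clustered index is safe
        CREATE UNIQUE CLUSTERED INDEX IX_PersonalIds
            ON #PersonalIds (EmployeePersonalID);

        -- the main query then joins to #PersonalIds instead of the CTE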

    Stats have been updated with FULLSCAN on all tables affected by this query.

    My manager has a deep hatred of #temp tables, and instead prefers us to tweak code and indexes. I personally disagree, but it's not my choice.

    I'll try the indexes you suggest, but one of them alone will add 2.5 GB to the database, as it's on a table with ~31 million rows.

    SQLNYC

  • I didn't suggest indexing the CTE. Ensure that the tables that take part in that join are indexed.

    Temp tables are a tool; use them where they work. Hating temp tables is like a carpenter hating circular saws.

    Can you post the plan from the proc with the maxdop 1 hint?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Gail - here is the plan with the MAXDOP hint. I was able to create all of the indexes you suggested except for one on the CheckTaxes table.

    SQLNYC
