Parameter Sniffing with sp_ExecuteSQL?

  • Does anyone know if Parameter Sniffing can occur when using dynamic SQL through sp_ExecuteSQL (and no other stored procedures)?

    If so, what are the recommended fixes?

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • I would imagine so. It's a cached plan with parameters, not much different from a stored proc.

Fixes would be the same as for a stored proc: use variables instead of parameters, split into separate queries if possible, or use the recompile hint.
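For illustration, a minimal sketch of the recompile-hint and local-variable mitigations applied to a parameterised sp_executesql call (the table and parameter names here are hypothetical):

-- Option 1: RECOMPILE hint. A fresh plan is compiled for each
-- execution, so the sniffed value is always the current one.
EXEC sp_executesql
    N'SELECT ResourceGuid
      FROM dbo.SomeTable
      WHERE CollectionGuid = @collectionGuid
      OPTION (RECOMPILE);',
    N'@collectionGuid uniqueidentifier',
    @collectionGuid = 'B6A61E53-BCCA-4336-B15A-8FD2975F67C9';

-- Option 2: copy the parameter into a local variable. The optimiser
-- cannot sniff a variable's value, so it falls back to average
-- density estimates instead of the first caller's value.
EXEC sp_executesql
    N'DECLARE @cg uniqueidentifier;
      SET @cg = @collectionGuid;
      SELECT ResourceGuid
      FROM dbo.SomeTable
      WHERE CollectionGuid = @cg;',
    N'@collectionGuid uniqueidentifier',
    @collectionGuid = 'B6A61E53-BCCA-4336-B15A-8FD2975F67C9';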

    Do you have an example, or is it a theoretical question?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
Not sure. I am considering parameter sniffing as a possibility for why the query plan for a query is so wrong. I have attached it as a .TXT file (though it is really a .sqlplan). It estimates something like 8 seconds, but the actual time is around 900 seconds. The problem seems to be that the Evt_NS_Client_Config_Request table is completely mis-estimated and mishandled by the optimizer. It has approximately 44 million rows, but as you can see, the plan estimates only about 2.5 seconds to deal with it. In fact, almost all of the query's time is spent on this table.

-- RBarryYoung, (302)375-0451, blog: MovingSQL.com, Twitter: @RBarryYoung
Proactive Performance Solutions, Inc.
"Performance is our middle name."

  • That looks like the estimated plan. Is there any possibility of seeing the actual plan please?

    The only parameter I can see in there is @collectionGuid. Will vastly different numbers of rows be affected for different values of that?

    Where are you getting the estimated time from?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (10/3/2008)


    That looks like the estimated plan. Is there any possibility of seeing the actual plan please?

    Probably not from the production environment in the time that I have left (it is very hard to catch & extract and I am done in 2 hours).

    The only parameter I can see in there is @collectionGuid. Will vastly different numbers of rows be affected for different values of that?

The times do not vary much with the parameters. I am not suspecting parameter sniffing anymore...

    Where are you getting the estimated time from?

Subtree cost of the root node of the second statement.

At this point, I am convinced that the problem is that the optimiser is seriously mis-estimating the row counts from most of the table lookups and one of the joins. I am not sure why; the statistics are up to date as far as I can tell.

    For instance, here is the problem query, reformatted for testing & re-execution (after the temp tables are made):

Declare @collectionGuid uniqueidentifier, @LastRunDate datetime

Select @collectionGuid = 'B6A61E53-BCCA-4336-B15A-8FD2975F67C9',
       @LastRunDate = 'Oct 2 2008 11:26:29:490AM'

--====== NOTE: this is the Query that is taking 15+ minutes
INSERT INTO [#0ea9a6bfe6f948ce8ed5e5e1187f930c] (ResourceGuid)
(
    SELECT ResourceGuid
    FROM CollectionIncludeResource
    WHERE CollectionGuid = @collectionGuid
    UNION
    SELECT ResourceGuid
    FROM CollectionMembership cm
    JOIN CollectionIncludeCollection cic ON cm.CollectionGuid = cic.SubCollectionGuid
    WHERE cic.CollectionGuid = @collectionGuid
)
UNION
SELECT w.Guid AS ResourceGuid
FROM #vResource w
WHERE (([w].[Guid] IN (
    SELECT Guid FROM (
        SELECT Guid FROM #vResource WHERE Guid IN (
            --====== This is the problem subquery, marked with **
            SELECT a._ResourceGuid AS Guid
            FROM Inv_AeX_AC_Client_Agent a                          --**
            INNER JOIN #vComputer c ON c.Guid = a._ResourceGuid
            INNER JOIN Evt_NS_Client_Config_Request ccr             --**
                ON ccr.ResourceGuid = a._ResourceGuid               --**
            WHERE a.[Agent Name] = 'Altiris Software Update Agent'
            --====== End problem area ======
            AND a._ResourceGuid NOT IN (
                --====== mis-estimated subquery:
                SELECT pr.ResourceGuid
                FROM vItem i
                INNER JOIN vPolicyAppliesToResource pr ON pr.PolicyGuid = i.Guid
                WHERE i.ClassGuid = '5e5bde22-c290-4a94-a36c-c5076da6d565'
                AND CollectionGuid <> 'b6a61e53-bcca-4336-b15a-8fd2975f67c9'
                --====== end mis-estimated subquery
            )
            AND CAST(a.[Build Number] AS INT) >= 2437 -- 6.2 GA agent build number
            GROUP BY a._ResourceGuid
            HAVING DATEDIFF(dd, MAX(ccr._eventTime), GETDATE()) < 90
        )
    ) tab
    )
))

    ... the "mis-estimated" subquery is estimated at 27000 rows, but is actually 0 rows. If I replace it with nothing, or extract it to temp table built before this query (takes about 1 second), a different execution plan is made and run-time drops by a factor of 3x-4x.

    I'll post a follow-up later tonight or tomorrow...

-- RBarryYoung, (302)375-0451, blog: MovingSQL.com, Twitter: @RBarryYoung
Proactive Performance Solutions, Inc.
"Performance is our middle name."

  • rbarryyoung (10/3/2008)


    GilaMonster (10/3/2008)


    That looks like the estimated plan. Is there any possibility of seeing the actual plan please?

    Probably not from the production environment in the time that I have left (it is very hard to catch & extract and I am done in 2 hours).

    Tomorrow then? Seeing where the bad row estimations are should help a lot here.

You could run the update within a transaction and roll it back afterwards. It's the simplest way of getting the actual plan, though it may impact other stuff in the system if the DB is busy.
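A minimal sketch of that approach:

BEGIN TRANSACTION

-- run the problem INSERT here, with SET STATISTICS XML ON
-- (or "Include Actual Execution Plan" in SSMS) to capture the actual plan

ROLLBACK TRANSACTION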

    Where are you getting the estimated time from?

Subtree cost of the root node of the second statement.

    The costs aren't times. They're a unit-less measure calculated from estimated time, estimated CPU, estimated IO, estimated memory and probably a few other things. They're just a measure of how expensive the query will be to execute, not how long it will take.

    ... the "mis-estimated" subquery is estimated at 27000 rows, but is actually 0 rows. If I replace it with nothing, or extract it to temp table built before this query (takes about 1 second), a different execution plan is made and run-time drops by a factor of 3x-4x.

    I'll post a follow-up later tonight or tomorrow...

    I don't suppose you're in a position to be able to change the query? That would be the simplest, since you know it helps. I did note a missing index listed in the plan. Dunno if adding it will be of any help.

I'll check in tomorrow. It's almost bed-time here.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
OK, here is the SHOWPLAN XML Statistics output from after a run. Note that this is not from the production system but from my laptop, after I copied the database to my portable USB drive; however, the run-times are similar.

-- RBarryYoung, (302)375-0451, blog: MovingSQL.com, Twitter: @RBarryYoung
Proactive Performance Solutions, Inc.
"Performance is our middle name."

Cool. I'll take a look and get back to you tomorrow or Monday. It's late here.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Very quick first glance.

    There are two places with very bad estimates:

The first is the clustered index scan of Inv_AeX_AC_Client_Agent, which estimates 1901 rows and gets over 6000. It may be that this is due to the conversion. What data type is BuildNumber?

I would suggest an NC index there, on [Agent Name] and [Build Number], with _ResourceGuid as an included column. See if it helps at all.
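Something along these lines (the index name is made up):

CREATE NONCLUSTERED INDEX IX_ClientAgent_AgentName_BuildNumber
ON Inv_AeX_AC_Client_Agent ([Agent Name], [Build Number])
INCLUDE (_ResourceGuid)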

The second problem is on the index seek on Evt_NS_Client_Config_Request. SQL's estimating 7700 rows, and it's executing the operator 6300 times. That's 48 000 000 rows that it will have to process. The multiple executes are because that seek is the inner input of a nested loop join, run once per outer row.

Those 48 000 000 rows are then getting joined to the rowset coming from Inv_AeX_AC_Client_Agent, then aggregated to get back to the 6300 rows we started with. It's a many-to-many join that's producing a huge resultset, and the group by is acting like a distinct.

Are you using anything from the Inv_AeX_AC_Client_Agent table? If it's just in the join to filter (inner join), can you try converting it into an EXISTS in the where clause? Since the EXISTS just checks true or false rather than returning rows, that should eliminate the massive number of rows being produced by the join and then aggregated away.
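Roughly this shape; a simplified sketch of the problem join, not the vendor's exact query:

SELECT ccr.ResourceGuid
FROM Evt_NS_Client_Config_Request ccr
WHERE EXISTS (SELECT 1
              FROM Inv_AeX_AC_Client_Agent a
              WHERE a._ResourceGuid = ccr.ResourceGuid
              AND a.[Agent Name] = 'Altiris Software Update Agent'
              AND CAST(a.[Build Number] AS INT) >= 2437)
GROUP BY ccr.ResourceGuid
HAVING DATEDIFF(dd, MAX(ccr._eventTime), GETDATE()) < 90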

    Can you change that query at all?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Where are you getting the estimated time from?

Subtree cost of the root node of the second statement.

    The costs aren't times. They're a unit-less measure calculated from estimated time, estimated CPU, estimated IO, estimated memory and probably a few other things. They're just a measure of how expensive the query will be to execute, not how long it will take.

Right, but translating estimated resource costs into estimated service times is usually pretty simple, especially if the scaling factor is close to one, which, in my experience, it usually is.

    I don't suppose you're in a position to be able to change the query? That would be the simplest, since you know it helps.

Nope, the SQL commands are being issued by the product's application server, specifically its task-server, so all I can do is advise my customer to ask the vendor to make certain changes.

Edit: I should say, that's all I can do with respect to code changes... I can add indexes with tacit vendor permission, which I have.

-- RBarryYoung, (302)375-0451, blog: MovingSQL.com, Twitter: @RBarryYoung
Proactive Performance Solutions, Inc.
"Performance is our middle name."

  • GilaMonster (10/4/2008)


    There are two places with very bad estimates:

The first is the clustered index scan of Inv_AeX_AC_Client_Agent, which estimates 1901 rows and gets over 6000. It may be that this is due to the conversion. What data type is BuildNumber?

I would suggest an NC index there, on [Agent Name] and [Build Number], with _ResourceGuid as an included column. See if it helps at all.

Right. BuildNumber is an nvarchar() that only contains numeric strings, a seriously bad design oversight. Consequently, a greater-than comparison on BuildNumber cannot use an index seek (because of the CAST), although the value could still be retrieved from an index that includes it. AgentName had low selectivity (only three values, that I could see) but could still precede _ResourceGuid, which is the only probe, join, and return column, so I went with an NC index over ([Agent Name], _ResourceGuid, [Build Number]).
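That is (index name made up):

CREATE NONCLUSTERED INDEX IX_ClientAgent_Agent_Resource_Build
ON Inv_AeX_AC_Client_Agent ([Agent Name], _ResourceGuid, [Build Number])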

The second problem is on the index seek on Evt_NS_Client_Config_Request. SQL's estimating 7700 rows, and it's executing the operator 6300 times. That's 48 000 000 rows that it will have to process. The multiple executes are because that seek is the inner input of a nested loop join, run once per outer row.

Those 48 000 000 rows are then getting joined to the rowset coming from Inv_AeX_AC_Client_Agent, then aggregated to get back to the 6300 rows we started with. It's a many-to-many join that's producing a huge resultset, and the group by is acting like a distinct.

    Yep, saw that also.

Are you using anything from the Inv_AeX_AC_Client_Agent table? If it's just in the join to filter (inner join), can you try converting it into an EXISTS in the where clause? Since the EXISTS just checks true or false rather than returning rows, that should eliminate the massive number of rows being produced by the join and then aggregated away.

Since I could not change the code, I spent more time looking for promising indexes. However, I did notice several instances of just what you point out. Apparently, the code writers really took to heart the many truisms about avoiding IN and EXISTS subqueries and changed them all into joins. My guess is that they were unaware that joins are not always better than subqueries, as this particular query proves.

-- RBarryYoung, (302)375-0451, blog: MovingSQL.com, Twitter: @RBarryYoung
Proactive Performance Solutions, Inc.
"Performance is our middle name."

Looks like they took the 'don't use DISTINCT' advice to heart too. Guess they didn't realise that a GROUP BY with no aggregations does exactly the same thing.
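For example, these two return identical results (and typically identical plans):

SELECT DISTINCT ResourceGuid FROM Evt_NS_Client_Config_Request

SELECT ResourceGuid FROM Evt_NS_Client_Config_Request GROUP BY ResourceGuid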

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
This rewrite should produce the same result:

INSERT [#0ea9a6bfe6f948ce8ed5e5e1187f930c] (ResourceGuid)
SELECT ResourceGuid
FROM   CollectionIncludeResource
WHERE  CollectionGuid = @collectionGuid

UNION

SELECT ResourceGuid
FROM   CollectionMembership AS cm
INNER JOIN CollectionIncludeCollection AS cic ON cm.CollectionGuid = cic.SubCollectionGuid
WHERE  cic.CollectionGuid = @collectionGuid

UNION

SELECT w.Guid
FROM   #vResource AS w
INNER JOIN (
    SELECT _ResourceGuid
    FROM Inv_AeX_AC_Client_Agent
    WHERE [Agent Name] = 'Altiris Software Update Agent'
    AND CAST([Build Number] AS INT) >= 2437
    GROUP BY _ResourceGuid
) AS a ON a._ResourceGuid = w.Guid
INNER JOIN #vComputer AS c ON c.Guid = a._ResourceGuid
INNER JOIN (
    SELECT ResourceGuid
    FROM Evt_NS_Client_Config_Request
    -- the integer result of DATEDIFF is implicitly converted back to
    -- datetime here, yielding midnight 90 days ago
    WHERE _eventTime > DATEDIFF(DAY, 90, GETDATE())
    GROUP BY ResourceGuid
) AS ccr ON ccr.ResourceGuid = a._ResourceGuid
LEFT JOIN (
    SELECT pr.ResourceGuid
    FROM vItem AS i
    INNER JOIN vPolicyAppliesToResource AS pr ON pr.PolicyGuid = i.Guid
    WHERE i.ClassGuid = '5e5bde22-c290-4a94-a36c-c5076da6d565'
    AND CollectionGuid <> 'b6a61e53-bcca-4336-b15a-8fd2975f67c9'
    GROUP BY pr.ResourceGuid
) AS yak ON yak.ResourceGuid = a._ResourceGuid
WHERE  yak.ResourceGuid IS NULL
-- OPTION (FORCE ORDER)

If you want to use the FORCE ORDER query hint, change the order of the JOINs so that the join with the fewest records appears first and the one with the most records appears last.

    The real boost should be putting the 90-day check in the derived table.

According to the plan, the filter occurs much later, after a join and a stream aggregate.

    Now it's easier to see that

* Inv_AeX_AC_Client_Agent needs a nonclustered index over [Agent Name], [Build Number] and _ResourceGUID.

    * Evt_NS_Client_Config_Request needs a clustered index over _eventTime and ResourceGUID.
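As CREATE INDEX statements, those two suggestions would look roughly like this (index names are made up):

CREATE NONCLUSTERED INDEX IX_ClientAgent_Agent_Build_Resource
ON Inv_AeX_AC_Client_Agent ([Agent Name], [Build Number], _ResourceGUID)

CREATE CLUSTERED INDEX CIX_ConfigRequest_EventTime_Resource
ON Evt_NS_Client_Config_Request (_eventTime, ResourceGUID)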

I have now read that a query rewrite is not an option. However, I am still interested in a side-by-side comparison of the two queries.

If nothing else, it would give Norton a decent suggestion for how to write their queries.


    N 56°04'39.16"
    E 12°55'05.25"

1) Yes, you can get parameter-sniffing type problems from sp_executesql.

    2) I wonder if the use of GUIDs is enabling a problem here? Gosh I hate those things from a performance standpoint! But I love them from a consultant-cleaning-up-the-mess standpoint!! 😀

    3) Are stats up to date?

4) If you use the RECOMPILE option on the query, do you get the same results? I don't think this will apply when using sp_executesql, but auto-parameterization could have cached a plan for the text-based execution, I suppose.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Just to clarify some stuff that may not have been clear:

1) This is a vendor-product-generated query; I cannot rewrite it. I have sent suggestions to the vendor on how they could rewrite it for improvement.

    2) There is no discernible problem with the statistics.

    3) Parameter-sniffing was not the problem.

    4) I finished this up on Friday night. Nonetheless, I welcome any comments or insights.

-- RBarryYoung, (302)375-0451, blog: MovingSQL.com, Twitter: @RBarryYoung
Proactive Performance Solutions, Inc.
"Performance is our middle name."
