Very odd behavior with parameter sniffing, overall poor performance post CU19

  • Hey all,

    Seeing something very strange. Overall performance on a SQL 2017 instance has dramatically worsen since CU19 was installed. Have historical records on nightly processing that shows that post CU19, report execution times went through the roof.

    One in specific, used to run consecutively in 1 minute. After CU19, it jumped up to 10 minutes and continued to climb. Basic elements like index/statistic management have been covered.

    Dove into the query, and found a specific query that ... I'm having a hard time understanding. If I leave it as-is, it takes ~20-40 minutes. If I remove the parameter in favor of a string date value, it's 1 second. If I add a OPTION (RECOMPILE) with the parameter intact, it's 1 second. Have updated all stats and rebuilt all indexes related to the query below, doesn't change anything. I did a a FREEPROCACHE and even restart SQL ... no change.

    I don't yet know if CU19 truly is the culprit, it's just the timing is very odd. No schema changes have happened. The overall database is suffering performance issues all post that update. We're working now on identifying more reports/queries that have recorded baselines pre update that were faster, to see if we can recreate this exact same scenario with having to use RECOMPILE. Want to ensure we're not chasing a red herring; however, gut says this behavior is now being seen elsewhere.

    Query below, any initial thoughts to better hone our focus?

    Thanks!

    SELECT MAX(PlacementActivity.EndDate) AS LastPlacementDate,
    MAX(PlacementActivity.ProviderID) AS ProviderID
    --INTO #TEMPLASTPLACEMENT
    FROM PlacementActivity
    WHERE PlacementActivity.BeginDate < dateadd(day,1,@EndDate)

    AND PlacementActivity.ProviderID NOT IN (
    SELECT DISTINCT PlacementActivity.ProviderID
    FROM PlacementActivity
    LEFT OUTER JOIN Provider ON PlacementActivity.ProviderID = Provider.ProviderID
    left outer join Agency on Provider.AgencyID = Agency.AgencyID
    WHERE PlacementActivity.BeginDate < dateadd(day,1,@EndDate)
    AND (PlacementActivity.EndDate IS NULL OR PlacementActivity.EndDate >= dateadd(day,1,@EndDate))
    )
    AND PlacementActivity.EndDate IS NOT NULL
    GROUP BY PlacementActivity.ProviderID
  • Is @EndDate a parameter or a local variable? If it's a parameter, I would assume that parameter sniffing should, in general, give you similar performance to having a hard-coded value.

    However, if it's a local variable, that pretty much explains the behavior. A local variable will use an average across your statistics. A hard-coded value will, of course, use the value. A recompile results in variable sniffing. Just like parameter sniffing, but with variables, which, acts like the hard-coded value again.

    Now, why would this suddenly change because of CU-19? Not a clue. It's possible that your stats have crossed some threshold where the variable used to work, but now it doesn't? Maybe? Guessing there.

    However, all that is mute if it's a parameter. Then I haven't a clue.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Hey Grant, thanks for the reply.

    The code above, is all I'm running. It's taken from the sproc; however, just as an adhoc T-SQL. I can recreate the behavior meaning, hard coding the variable as shown in the above code.

    I would have expected a statistic update to resolve, but it hasn't. For now we've simply used the RECOMPILE and things are working well. I'd prefer not to rely on this as it appears we're seeing similar behavior elsewhere. Working on finding some more specifics before we determine that it's the same symptoms.

  • So it is a parameter & not a local variable?

    Hmmm.... What about making it a local variable. Instead of doing the calculation on the date value there, do it on a local variable and then use it. What happens then?

    Speaking of calculation, did you simply replace the parameter/variable in the tests, or did you replace the calculation entirely? That would change everything.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Yeah I was thinking about taking the parameter and assigning it to a local variable, but before even going there ... why does this behavior exist as an adhoc statement? As you see above, I'm assigning the variable a value and the issue exists. That's the statement as-is, only setting the variable.

    I should have mentioned. As you can see, multiple references to @EndDate. Changing any of them to a manual value, doesn't impact. The one that actually causes the 1 sec to 40 minute runtime, is this one:

    AND (PlacementActivity.EndDate IS NULL OR PlacementActivity.EndDate >= dateadd(day,1,@EndDate)))

    As soon as I either change this to the actual date, OR use RECOMPILE, everything is fine.

  • Have you looked at the plans yet to see what the differences in row estimates and behavior are?

    I read through the update notes. I didn't spot anything that looks like a change that would impact your situation. Closest thing I saw was the change in UPPER/LOWER/RTRIM.

    At this point, I've got nothing. I think looking at the plans and the row counts might provide a clue or two. Also, look at the plan properties to see if there are differences between the statistics used. Again, might be a clue.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Exec plans are on my to do, I haven't had the time to let it actually finish (the full 40 minute).

    If it helps ... the EndDate from the PlacementActivity, is primarily NULL vs. actual dates. The distribution is largely favored towards NULL.

    I tried for OPTIMIZE FOR (@EndDate UNKNOWN), no change.

  • Just getting the Execution Plan should be enough. Getting the Plan Plus Runtime Metrics won't add much to your information. Use the "Estimated Plan" button and that will give you all the row estimates.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Soooo, I let the query run, it was up to 3.5 hours! I am beyond amazed here. I cancelled it, added a new index (estimated exec plan gave me an idea), still stuck.

    Attached are both versions - with recompile and the original but as an estimate, not actual.

    The major hit on the default query, is spending time on a Clustered Index Scan, on PlacementActivity - which is what EndDate is filtering on. Yet with the RECOMPILE, it's using that brand new Index I just made. If I remove it, it reverts to an index scan on an index for BeginDate and EndDate.

     

  • I added an index hint option, execution plan shows it's using it ... but it still hangs! What the hell! (attached)

  • OK well, I guess we can't upload .sqlplan files eh?

  • Yeah, you have to save them as XML and then they can upload.

    Index hints are EXTREMELY hit or miss and mostly miss. The plan will show the index being used, but you'll see a lot of other differences. Hints are very seldom the way to go. It's almost always about code, structure and statistics.

    Just get estimated plans, execution plans. We need to know what the row estimates are, not the final row counts. So no need capture run time metrics.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Attached, hopefully!

    Attachments:
    You must be logged in to view attached files.
  • did you try cleaning the query and using exists instead of not in ?

    Assuming that the query is exactly as you supplied the following could be changed - may or not affect your timings.

    SELECT MAX(pa1.EndDate) AS LastPlacementDate,
    pa1.ProviderID AS ProviderID -- removed max - it is the group by field so not needed
    --INTO #TEMPLASTPLACEMENT
    FROM PlacementActivity pa1
    WHERE pa1.BeginDate < dateadd(day,1,@EndDate)

    AND pa1.ProviderID NOT IN (
    SELECT DISTINCT pa2.ProviderID
    FROM PlacementActivity pa2
    -- left outers not required as not used anywhwere on the query so removed them
    --LEFT OUTER JOIN Provider ON PlacementActivity.ProviderID = Provider.ProviderID
    --left outer join Agency on Provider.AgencyID = Agency.AgencyID
    WHERE pa2.BeginDate < dateadd(day,1,@EndDate)
    AND (pa2.EndDate IS NULL OR pa2.EndDate >= dateadd(day,1,@EndDate))
    )
    AND pa1.EndDate IS NOT NULL
    GROUP BY pa1.ProviderID



    SELECT MAX(pa1.EndDate) AS LastPlacementDate,
    pa1.ProviderID AS ProviderID -- removed max - it is the group by field so not needed
    --INTO #TEMPLASTPLACEMENT
    FROM PlacementActivity pa1
    WHERE pa1.BeginDate < dateadd(day,1,@EndDate)

    AND not exists (
    SELECT pa2.ProviderID
    FROM PlacementActivity pa2
    -- left outers not required as not used anywhwere on the query so removed them
    --LEFT OUTER JOIN Provider ON PlacementActivity.ProviderID = Provider.ProviderID
    --left outer join Agency on Provider.AgencyID = Agency.AgencyID
    WHERE pa2.BeginDate < dateadd(day,1,@EndDate)
    AND (pa2.EndDate IS NULL OR pa2.EndDate >= dateadd(day,1,@EndDate))
    and pa2.ProviderID = pa1.ProviderID
    )
    AND pa1.EndDate IS NOT NULL
    GROUP BY pa1.ProviderID
  • @frederico_fonseca - awesome find! Yeah, that absolutely worked, just as fast as using the RECOMPILE hint. Thank you!

    I'm still not yet clear on why the original problem exists though. My DBA skills are very much so rusted; however, when doing this outside of a proc, using a local variable, how am I still causing parameter sniffing?

    I have to assume it's the variation of data on EndDate? NULL owns the larger count (3,472 of the overall 13,424 unique values). So if in fact this was working up until recently, why now is this poor performance surfacing?

    Thanks!

Viewing 15 posts - 1 through 15 (of 15 total)

You must be logged in to reply to this topic. Login to reply