Upgrade to SQL2019 from 2012 and everything is slow - what to look for?

  • We have recently started a process of planning the upgrade of our four-node SQL2012 availability group to SQL2019. The main database is pretty large (a few TB) and we've spent years tuning the SQL to get the performance to where it needs to be such that huge complex procedures absolutely fly on SQL2012. We've had performance contractors in in the past who've taken all the low hanging fruit out and much of the less easy wins as well, so we are optimised to the hilt on 2012 and have very few performance problems.

    To plan the upgrade we've taken one of our live nodes out of the group, upgraded to 2019 and restored the databases onto it, set the compat mode to 2019, then rebuilt indexes etc (we're not certain that part is required if anyone knows for definite) and if we leave all the new shiny SQL2019 features enabled we get abysmal performance. For example we have one procedure that creates a table variable, inserts two rows, then does a query based on that and a bunch of other tables including one in another DB on the same box (also 2019 compat mode, also with all its indexes rebuilt, etc). In SQL2012 it runs in, at worst, just over a minute, but in SQL2019 with default settings and all other things being equal, it takes almost 3. If we turn on the "Legacy Cardinality Estimator" that makes a big difference and brings it down to about 33% slower than 2012, but we don't really want to do that because that's a database level setting that will prevent us from feeling the benefit of the new tech in SQL2019.

    Has anyone else had similar experiences and managed to get around them without having to run SQL2019 in legacy mode? We have too much code in there for refactoring everything to be an option, and we're confused in any case as, like I said, we've spent years making it more and more efficient. As it stands, if the database experts in the company took their best work to a new company that was running on 2019, it looks like it would be slow, which seems counterintuitive as newer is usually faster.

    We know that the standard advice is to run with the QueryStore enabled and force the good plans back in when things regress, but we can't afford to even put this live in the state it's in, so we won't be able to gather enough data for it to be meaningful.

    Thanks for any input

    Kevin

    -------------------------------Oh no!

  • enable query store and get a good view onto what is going on.

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data and code to get the best help

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

    1. Set your db compatability to 110 (SQL2012)
    2. Enable Query Store and let your app run for a few days.
    3. Set your db compatability to 150 (SQL2019)
    4. Wait a bit a see what plans Query Store is enforcing.

    You now know what 2019 plans are inefficient so you will be in a position to work out what needs to change. You should probably read:

    Optimizing Your Query Plans with the SQL Server 2014 Cardinality Estimator

  • Your choices are pretty limited. Plans that are negatively affected by the 2014 Cardinality Estimation engine are generally marginal in terms of performance and behavior, prior to getting the new CE. Best solution, change the code and architecture so that it's no longer coming up with poorly performing plans. Second best is what the other two posters have suggested, enable query store with the old compatibility mode, then swap to the new compatibility mode and force plans (and then start working on reworking those queries and structures, just as in the first choice). Next option, you could try identifying the problematic queries on your own, no Query Store, and then generate plan guides to force the behavior. Good luck. Another option is to identify the problematic queries, then modify the code to supply query hints to force the use of the old CE. Last option is to try to force hints through the use of guides. Again, good luck.

    The best solution is to rework the code. And yeah, I know, that's not on the table. Second best is Query Store.

    Bottom of the stack, never change the compatibility mode, ever. Horrible choice, but one I've seen people make. You basically cripple performance of all other queries for a few that honestly need love anyway. Also, you limit the new functionality you can take advantage of.

    Just work on the code.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Just work on the code

    The issue with this seemingly trivial piece of advice is that we have no idea at this point what sacrifices to the god of SQL2019 we will need to make, and thus what to DO with the code that we're working on. To go through the plans query by query is going to take quite literally years, and then based on the lack of success we've had with one simple example, that might not help anyway.

    We have done all the suggested things (from people like Brent Ozar, Pinal Dave, Jeff Moden, and many others) over the last fifteen years to update and develop new SQL and make these things fast in SQL2000, then SQL2008, then SQL2012 so at this point with our SQL codebase of several hundred thousand lines of code, we're going to have to work out, then learn, then apply new ways of making things fast that just weren't required or didn't work in SQL2012, while presumably removing some of the stuff that did previously work.

    For example, we've spent a lot of time optimising the indexes and includes required for various workloads. We've swapped out table variables for temp tables where the cardinality of the table variable was significantly different from the default single row in pre-2019 editions. We've broken complex multi-part queries into smaller faster simpler pieces and put the data into indexed temp tables for subsequent parts to work on. We've got rid of basically every key lookup, every table scan, most index scans, all the non-sargable joins based on scalar functions, etc. Everything that I and my colleagues know to be best practice for performance, we have learned, we have taught to new staff, and we have done to all our systems. If SQL2019 throws out a load of those best practices and runs like a bag of mud unless we use an entirely new set, then I genuinely don't see how we will ever have the manpower and expertise available to us for long enough to do a full upgrade to the new features, and worst of all I've not been able to find a useful list yet of what we might have done to improve old versions, that needs undoing or changing in SQL2019, so initially at least there's going to be lots of guesswork and research as we rip out the speed tricks we've added and just see if we can go faster without them. I hate the whole idea of it, quite frankly.

    Even with the legacy cardinality estimator in place, it's 20% or more slower. Even in 110 compatibility mode, it's 20% or more slower. It just feels like they've broken it. We've had our DBA contractor trying to rewrite one of the pretty simple procedures to run quickly in 2019 with all the new stuff on, and he's got nowhere. Interestingly it was fast immediately after the restore of the 2012 DB when switched into 2019 mode with stats updated on the tables involved but as soon as the stats were updated on the indexes the performance fell off a cliff and went from 8 seconds (similar to how it is in SQL2012) to 2 minutes plus.

    We basically can't understand why it is that the best practices that we've applied previously suddenly appear to be null and void in the new cardinality estimator / SQL2019 optimiser world.

    Sorry to rant but we've burned man-weeks on this upgrade attempt already and I'm already sick to death of trying to circumvent stuff that is just objectively worse performing.

    Anyone got a useful reference to a list of low hanging fruit things to look for and change when moving existing performant code to the newer CE? I've found a white paper on it here, and I've found Brent Ozar's advice here to just potentially leave the old one turned on, though as I said that's still a 20% performance hit in some important places.

    • This reply was modified 2 years, 5 months ago by Kevin Gill. Reason: trying to fix a link

    -------------------------------Oh no!

  • @kevin Gill

    EDIT:  Sorry... I didn't read your last post where you said you've already tried the following.

    The issue is, almost certainly, the changes they made to the Cardinality Estimator in SQL Server 2014, which crushed us way back when we upgraded from 2012 to 2016.  There's a trace flag that we added to the SQL Server startup to set the entire instance to always use the "Legacy Cardinality Estimator" and it instantly fixed everything for us (and we didn't have the time to go looking for the issues nor would we have had the time to fix them) .

    IIRC, SQL Server 2019 now has that as an Instance Level and Database Level setting in Object Explorer as well as it being a setting that can be done using T-SQL. Post back if you need more info on that or anything else about this.

    • This reply was modified 2 years, 5 months ago by Jeff Moden. Reason: I didn't read your last post where you said you've already tried the following

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • p.s.  And, I agree... MS seems to follow a policy of "patch it until it breaks".  🙁  They're "It just runs faster" brag in 2016 broke some serious stuff for us when it comes to Temp DB and doing a massive insert to a new table that requires the use of SET IDENTITY INSERT ON.  We're getting ready to migrate from 2016 to 2022 and I'm scared to death, especially with all the automatic "fine tuning" of execution plans, etc, etc they now do.

    Another example is that if you use GENERATE_SERIES() as a row source for building a huge test table, it won't be minimally logged.  Just changing the row source back to fnTally fixes that.  Instead of generating 87GB of log file to create a 52GB table, fnTally takes it back down to just 600MB of log file in a true, minimally logged fashion.  Very fortunately, fnTally isn't automatically replaced with GENERATE_SERIES(). 😀

    I also used a Trace Flag to turn off "Fast Inserts" because it wasn't faster and it was causing a huge amount of "Allocated but Unused" file space because every bloody front end seems to favor the use of "Insert Bulk" (not to be confused with "Bulk Insert"), which ends up allocating an entire Extent to store {insert drum roll here} ONE bloody row every time it's used to insert one row.

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Kevin Gill wrote:

    we have one procedure that creates a table variable, inserts two rows, then does a query based on that and a bunch of other tables

    How many is "a bunch" in this case? SQL Server 2019 can choke trying to compile a stored procedure that includes a lot of JOIN operations in a single query. I have seen it take a couple minutes to compile and a couple seconds to execute. If such a stored procedure is restructured to reduce the number of JOIN operations in a single query then compile performance returns to normal.

    Creator of SQLFacts, a free suite of tools for SQL Server database professionals.

  • We are in the process of upgrading from 2014 to 2019 so both are using the same cardinality estimator. We are experiencing almost across the board a 6% to 10% performance penalty on all our SSIS loads in 2019. We have tried with our databases in 2014 compatibility as well as 2019 with no real differences. Both have the parallelism threshold set at the very low cost of 5.

    We've had our VM experts and service providers responsible for security software, and our DBA. all looking to make sure the host servers are configured the same. In fact, some exclusions have been added to the security software that have made the 2019 server perform better but it is still slower than 2014 where those changes have not been made.

    While trying to pinpoint a general cause, I have found certain steps that involve selecting from a columnstore table where the plan goes parallel on 2014 but not on 2019. This has been a good thing as the CE thinks there are 327 rows to fetch when in fact there are 32.7 million. Both execution plans show the same but 2014 chooses to go parallel and 2019 does not. I can make the latter perform far better by using the hint OPTION(USE HINT('ENABLE_PARALLEL_PLAN_PREFERENCE')) - in fact it then performs better than 2014 for that particular SELECT.

    Actually, if there really were only 327 rows then 2019 is picking the better plan. For this one, changing the compatibility of the database did fix the performance so it's not a VM configuration thing. Just flipping the compatibility back and forth caused the difference in the execution plans to manifest.

    In this particular case, I've been talking to Erland Sommerskog on Microsoft Q&A about the possible cause of the very poor estimate and I've addressed all the usual stuff around statistics. The probable cause is the use of predicates in views on one of the related dimensions pushing down and causing the optimiser to assume that many of the fact rows will be filtered out. That's bad design but I'm not in a position to fix that. There are plenty of areas where the poorer performance can't be pinned to something obvious like this, anyway, so it's not the answer.

    Like you, we have way too much code / too many packages to make individual changes. Our DBA has been running a simple CPU spin test on both and found the 2014 server performs much better than 2019. He has also run some tests using an open source DB, HammerDB. That showed the OLTP workload better on 2014 and OLAP better on 2019. Hopefully that means that once the data is in the warehouse, our users will get better performance for their reporting.

  • Just so you know, the only people who will see this, and maybe provide some answers, are ones who already subscribe to this thread, or find it by accident. If you want more eyeballs, and potentially more help, I'd suggest putting up your own post.

    However...

    Upgrades are always fraught with danger. This is even more true of something as persnickety as the query optimization engine within SQL Server. You're already on about the best path, talking to Erland, who is great at this stuff. The fundamental issue is simple, behavior is different in the newer versions. I won't say better. I won't say worse. The fact is, for some queries, it will be either.

    We're faced with hard choices. Never upgrade so you avoid the problem entirely. Then, never get new functionality or fix old problems, fall out of support, and possible run into issues there's no easy way out of as the technology stack ages. Or, upgrade and then start the wrestling match of finding out where your perfectly valid code is getting messed over by the new optimizer, or, your cruddy code which ran well enough, now doesn't, but hey, you get new functionality.

    The only thing I can tell you is that there isn't a magic switch. Hit this traceflag and all problems disappear. It is going to be about using tools like Query Store to help out where it can, and then attacking the code and structure where it can't help.

    And yes, I know that going after the code is not what anyone wants to do, or many cases, can do. I get that. But, we're back to our two choices. We don't upgrade, so we avoid these issues, or we upgrade, and then we must deal with the problem. And part of dealing with it, is, addressing issues in the code & structures (even though, yes, sometimes, you shouldn't have to).

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Does the 6% performance reduction pose a real operational problem at the system level?

    At the micro scale, small performance reductions may not matter to the individual process incurring it, but at the level of the entire SQL server it could provide benefits that reduce parallelism, conserving cores that other processes can use concurrently that previously would have had to queue up, or provide more accurate memory grants that don't blow out the plan cache, making it necessary to provide more ram, cpu and disk for everything else to perform adequately.

  • CreateIndexNonclustered wrote:

    Does the 6% performance reduction pose a real operational problem at the system level?

    At the micro scale, small performance reductions may not matter to the individual process incurring it, but at the level of the entire SQL server it could provide benefits that reduce parallelism, conserving cores that other processes can use concurrently that previously would have had to queue up, or provide more accurate memory grants that don't blow out the plan cache, making it necessary to provide more ram, cpu and disk for everything else to perform adequately.

    I guess my question is, why do we need to tolerate even that?  2019 is supposed to have new things in it that will make code run faster... and it doesn't.

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Just to add to the previous comments. I upgraded a system from SQL 2012 to SQL 2019 RDS on AWS. One of the most impactful actions after identifying slow-running queries was to use the "Database Engine Tuning Advisor" on them. Then, I implemented the recommended statistics which made a big difference to performance.

  • Jonathan AC Roberts wrote:

    Just to add to the previous comments. I upgraded a system from SQL 2012 to SQL 2019 RDS on AWS. One of the most impactful actions after identifying slow-running queries was to use the "Database Engine Tuning Advisor" on them. Then, I implemented the recommended statistics which made a big difference to performance.

    While I agree with that action, there should be no degradation in performance when migrating to a later version of SQL Server.  We've been crushed by our migration from 2016 to 2022.  Some simple testing of 2017 vs 2022 on the same computer shows a whopping 37% degradation for some relatively simple queries including a common one that uses no logical or physical reads or writes.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden wrote:

    Jonathan AC Roberts wrote:

    Just to add to the previous comments. I upgraded a system from SQL 2012 to SQL 2019 RDS on AWS. One of the most impactful actions after identifying slow-running queries was to use the "Database Engine Tuning Advisor" on them. Then, I implemented the recommended statistics which made a big difference to performance.

    While I agree with that action, there should be no degradation in performance when migrating to a later version of SQL Server.  We've been crushed by our migration from 2016 to 2022.  Some simple testing of 2017 vs 2022 on the same computer shows a whopping 37% degradation for some relatively simple queries including a common one that uses no logical or physical reads or writes.

    We had some SSRS reports that were taking about 6 seconds to run on the old 2012 environment. When they were run on the new 2019 environment they were taking over 1 minute, causing the web page to timeout. Adding the statistics recommended by Database Engine Tuning Advisor brought the runtime down to about 1 second. Other queries I had to rewrite, but nearly always managed to get them down to at least the runtime of the 2012 system. It's worth trying all the different methods of tuning the old queries. It is odd though how a new database upgrade with a better optimiser often just gives slower performance.

Viewing 15 posts - 1 through 15 (of 18 total)

You must be logged in to reply to this topic. Login to reply