Differences in Query execution plan between 2 servers

  • Hi all,

    I have a server running on Amazon which was duplicated on Data Center. Databases are identical, statistics is uptodate, indexes are not fragmented. Query takes 4 min on 1 server and never come out on another. I tried re-writing query, but nothing helps. Execution plan is showing 90-98% of sorting or hash match. I tried to put XML of the query from the "good" machine and force the 2nd machine to use it - this did work. But the moment I remove it, the performance is bad again. The question is - what kind of settings can cause SQL Server 2008 R2 to choose wrong execution plan and what needs to be changed?

    Thanks.

    Irina

    P.S. Both plans are attached. Amazon_machine is the good one and data_center is the bad one

  • Your best bet is to save both execution plans and post them here. Otherwise, it will be hard to tell.

    Jared
    CE - Microsoft

  • Looking at the plans, the first thing that jumps out is that the estimates are different between the two plans. This suggests that you either have completely different data sets or completely different sets of statistics for the same data set. It has to be one or the other. The optimizer works from the statistics it has and yours are absolutely different.

    That said, one plan is parallel and the other is not. Do you have different cost threshold for parallelism between the two servers? Do the two servers have different numbers of CPUs? Does one of them have affinity or something else along those lines different than the other?

    You also have a warning on the temp table on the poorly performing plan. It's missing statistics. Are auto-update stats turned off on that server?

    Those are the initial things I can see that you can check into to see why you're getting such different plans between the two servers.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Thank you very much for your suggestions.

    Please see attached the new execution plan after parallelism was turned on.

    The dataset is absolutely the same, the number of processors - 4 on a "good" machine and 8 on a "bad"

    I am not sure if autostatistic is on, but I run stats collection and it said it was not needed and no changes to execution plan.

  • One is estimating 230000 rows and the other is estimating millions. That is either a difference in data or in stats on the data. Must be. No other options.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Grant Fritchey (7/31/2012)


    One is estimating 230000 rows and the other is estimating millions. That is either a difference in data or in stats on the data. Must be. No other options.

    Or a slight difference in the query resulting in an accidental crossjoin(s)? To the OP, I'm not suggesting you are lying about the queries being exactly the same, its just that we all make mistakes (from time to time lol) and there are few other explanations.

    Jared
    CE - Microsoft

  • We run it so many times, 2 people. I wish I would make a mistake 🙁 Just run out of options.

    I tried yesterday to run a query with an execution plan from the "good" one and it did work, but for some unknown reasons SQL Server is choosing the wrong plan when it runs query as it is, even i tried to put tables in a different order in the "from" clause. We checked settings for the sql server and it is the same. I checked auto-update statistic option and it is set to "true".

  • SQLKnowItAll (7/31/2012)


    Grant Fritchey (7/31/2012)


    One is estimating 230000 rows and the other is estimating millions. That is either a difference in data or in stats on the data. Must be. No other options.

    Or a slight difference in the query resulting in an accidental crossjoin(s)? To the OP, I'm not suggesting you are lying about the queries being exactly the same, its just that we all make mistakes (from time to time lol) and there are few other explanations.

    Could be, but I compared the two queries. They look the same. I'm still on stats. Gotta be.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • As I mentioned we did run statistics and it didn't help. The other difference is that data center machine is clustered (3 physical machines with 3 sql server instances). May some settings be involved?

  • It does sound like stats one way or another, but as your non-ANSI joins are not simple equality checks, you could easily end up with a more stable plan by converting these to ANSI joins. I've given a quick example below - please check the results before using:

    SELECT pixel ,

    syndicationOutletId ,

    outlet ,

    geo.[metro-code] ,

    dma.metroName ,

    dma.regions ,

    SUM(imps) impressions ,

    COUNT(imps) uniques

    FROM #temp4 t

    INNER JOIN VINDICOSUITE_NETACUITY.dbo.GeoDatabase geo WITH ( NOLOCK) ON t.ipNumber BETWEEN geo.ipFrom AND geo.ipTo

    INNER JOIN VINDICOSUITE_NETACUITY.dbo.Metro dma ON geo.[metro-code] = dma.metroKey

    WHERE geo.[metro-code] NOT LIKE 'metro-code0'

    GROUP BY syndicationOutletId ,

    geo.[metro-code] ,

    dma.metroName ,

    outlet ,

    pixel ,

    dma.regions

    I'd also suggest not using the NOLOCK clause if you can help it. It's pretty dangerous unless you don't mind the results being inaccurate.

  • iashurova (7/31/2012)


    As I mentioned we did run statistics and it didn't help. The other difference is that data center machine is clustered (3 physical machines with 3 sql server instances). May some settings be involved?

    '

    But how else can you explain the differences in the number of estimated rows? Either there is different data between the two machines or different statistics. That assumes the query is the same (which it sure seems to be).

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Just tried it with and without (nolock) option - doesn't make any difference. The problem in my eyes that SQL Server is trying to join #temp table with one of the permanent tables first. In the good plan the 2 permanent tables are joined first and then temporary table. The only way to get it work is to change the order of tables in "from" clause and force SQL Server to use it. But this is just a way around. I need to find the reason for such a bad execution path and fix it, we can't re-write every query...

  • Are both tempdb databases configured the same on both servers?

    Jared
    CE - Microsoft

  • Just tried it with and without (nolock) option - doesn't make any difference. The problem in my eyes that SQL Server is trying to join #temp table with one of the permanent tables first. In the good plan the 2 permanent tables are joined first and then temporary table. The only way to get it work is to change the order of tables in "from" clause and force SQL Server to use it. But this is just a way around. I need to find the reason for such a bad execution path and fix it, we can't re-write every query...

  • I tried it with nolock and without, but it didn't have any impact on a bad query plan 🙁 the only improvement I saw when i put the table in the correct order and force SQL Server to use it. It is a way around, but it doesn't solve the mystery - why does SQL Server is choosing the wrong plan.

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply