Scalar UDF different performance on different server

  • Hi all,

    I have a query that calls on a UDF. In staging, has the same number of records and schema, it runs in 25 seconds. On production it used to run in seconds, but for some reason today it is very slow, to the point that if I let it run, it could run for a day or two. The execution plans are the same, the same indexes are being called, etc. The only difference I see is that the call to the UDF on production executes much slower (verified with Profiler) and the CPU usage on production is way, way high, compared to staging. Has anyone ever seen this sort of behavior??

    Thanks.

  • shahgols (11/7/2013)


    Hi all,

    I have a query that calls on a UDF. In staging, has the same number of records and schema, it runs in 25 seconds. On production it used to run in seconds, but for some reason today it is very slow, to the point that if I let it run, it could run for a day or two. The execution plans are the same, the same indexes are being called, etc. The only difference I see is that the call to the UDF on production executes much slower (verified with Profiler) and the CPU usage on production is way, way high, compared to staging. Has anyone ever seen this sort of behavior??

    Thanks.

    Are you absolutely, 100% positive that a) the code is exactly the same, b) the server builds are the same, c) the execution plans are the same (be sure to compare the XML and look for SET differences, etc)?

    I wrote a chapter for the SQL Server MVP Deep Dives 2 book entitled "Death by UDF". It is very aptly named and I STRONGLY encourage you to factor out the UDF and watch your queries run orders of magnitude faster!!!

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Hi Kevin, yes, I agree with you, UDFs are a nightmare for performance. I have replaced them before and have looked like a hero. haha. And when I take out the UDFs from the query, the query runs in 2 seconds flat! With UDFs, it may run for 1 or 2 days! Horrible!

    And your idea to compare the xml execution plan was right on. Most of the differences were too small (for example, EstimateCPU =0.000441494, EstimateCPU = 0.000440599), but the following sticks out:

    Staging server:

    <OptimizerHardwareDependentProperties EstimatedAvailableMemoryGrant="707788" EstimatedPagesCached="88473" EstimatedAvailableDegreeOfParallelism="2" />

    Production server:

    <OptimizerHardwareDependentProperties EstimatedAvailableMemoryGrant="353894" EstimatedPagesCached="176947" EstimatedAvailableDegreeOfParallelism="4" />

    Do you see any problems with such large differences? Any idea what I can do to get these numbers to match?

  • I am not at all convinced this is a plan thing. I would be more interested in waits on the production server. latching, locking, IO, CPU, etc.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • I'm with Kevin. It's very likely to be contention.

    But, from the differences you're seeing, I'd also suggest it's possibly statistics being different between the two. Those differences in estimations the plans can add up quite a bit.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • shahgols (11/7/2013)


    but for some reason today it is very slow.

    There must be any reason.. some immediate action done like

    1 data migration (heavy DML on related tables) which disturbed the statistics

    2 some indexe dropped (accidently)

    3 or any process running behind (not under consideration during this issue on prod).

    -------Bhuvnesh----------
    I work only to learn Sql Server...though my company pays me for getting their stuff done;-)

  • Hi everyone and many thanks for your responses.

    I had already checked on differences between table schemas, indexes, data, and also on wait/locks/etc. but didn't find any differences. I ran a update statistics statement in both environments with sample of 100%. I checked index fragmentation, which is about 5% on both servers. Any other checks I can do?

    One more question, based on the xml data I posted, it looks like the staging environment assigns a lot more RAM for the same query, isn't that correct? Could that be the reason why?

  • shahgols (11/8/2013)


    Hi everyone and many thanks for your responses.

    I had already checked on differences between table schemas, indexes, data, and also on wait/locks/etc. but didn't find any differences. I ran a update statistics statement in both environments with sample of 100%. I checked index fragmentation, which is about 5% on both servers. Any other checks I can do?

    One more question, based on the xml data I posted, it looks like the staging environment assigns a lot more RAM for the same query, isn't that correct? Could that be the reason why?

    Sure the memory grant can be important because it can lead to significantly more PHYSICAL IO as you get stuff spooled to disk for hashes, sorts, etc. What were the IOs for each query (and CPU)? I said you should check those. 🙂 NOTE though that you MUST use Profiler - the engine LIES to SSMS when UDFs are in play.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • I really appreciate your response and patience!

    If I run Profiler, the sql statement will never finish. haha

    I am using sp_whoisacitve. the CPU usage is much, muuuuch higher on production than it is on staging. And the "reads" column is somewhat higher. On staging, where the script runs in 25 seconds, the CPU shows 19K cycles, and reads shows 45K reads. On production, after 3 hours of running the same script, the CPU is at 9M cycles, and reads is at 64K reads.

    I hope that answers your question, otherwise let me know what to do (that doesn't involve Profiler) please.

  • This is a shot in the dark, but I am putting it out there just in case.

    I compared the actual execution plans and the XML of the staging execution plan contains these weird characters after the WHERE clause that don't exist in my query (or the query would obviously fail). Anyone know what these are?

    StatusId not in ('S','O','L') "

  • shahgols (11/8/2013)


    This is a shot in the dark, but I am putting it out there just in case.

    I compared the actual execution plans and the XML of the staging execution plan contains these weird characters after the WHERE clause that don't exist in my query (or the query would obviously fail). Anyone know what these are?

    StatusId not in ('S','O','L') "

    Just CR/LF pairs ...

    MM



    select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);

  • Forum Etiquette: How to post Reporting Services problems
  • [/url]
  • Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
  • [/url]
  • How to Post Performance Problems - by Gail Shaw
  • [/url]

  • You are right, that's what they were.... 🙁

    I was hoping they were something more important, and they would be the reason why my query runs faster in staging. 😛

  • What does the UDF in question do? It may be pertinent.

    MM



    select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);

  • Forum Etiquette: How to post Reporting Services problems
  • [/url]
  • Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
  • [/url]
  • How to Post Performance Problems - by Gail Shaw
  • [/url]

  • TheSQLGuru (11/8/2013)


    the engine LIES to SSMS when UDFs are in play.

    please elaborate. Or ANy article reference.

    -------Bhuvnesh----------
    I work only to learn Sql Server...though my company pays me for getting their stuff done;-)

  • shahgols (11/8/2013)


    I really appreciate your response and patience!

    If I run Profiler, the sql statement will never finish. haha

    I am using sp_whoisacitve. the CPU usage is much, muuuuch higher on production than it is on staging. And the "reads" column is somewhat higher. On staging, where the script runs in 25 seconds, the CPU shows 19K cycles, and reads shows 45K reads. On production, after 3 hours of running the same script, the CPU is at 9M cycles, and reads is at 64K reads.

    I hope that answers your question, otherwise let me know what to do (that doesn't involve Profiler) please.

    My guess is that you are running Profiler with STATEMENT completed as opposed to BATCH completed events. NEVER run profiler with STATEMENT completed if you have cursors, UDFs, etc in your production system. BATCH completed won't cause the query to run slowly by trying to capture every execution of the guts of the UDF like STATEMENT completed will.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Viewing 15 posts - 1 through 15 (of 29 total)

    You must be logged in to reply to this topic. Login to reply