SQL Profiler running on server side causes high CPU usage

  • TheSQLGuru (8/7/2009)


    And if you take the problem to Microsoft, that is the first thing they will check on - are you on the latest service pack.

    Good point!

    Though actually the first thing they check is that you have paid for a support contract :laugh:

  • Were you originally tracing events like SP:StmtCompleted in the Stored Procedures category? Have you now changed to tracing events from the TSQL section like SQL:BatchCompleted?

    I ask since you didn't provide those details, and I have recently seen this combination of high CPU (across sixteen cores) and lots of CMEMTHREAD waits on a system where heavy use is made of scalar user-defined function calls.

    Paul, this is exactly what I was doing and the results were also exactly what you are describing. The scalar functions showing up in Activity monitor with lot of CMEMThread type waits.

    We have scheduled a upgrade to SP3 next week. Hopefully, that should fix things. Many thanks for your reply.

    Cheers!

    Anish

  • anish_ns (8/7/2009)


    Were you originally tracing events like SP:StmtCompleted in the Stored Procedures category? Have you now changed to tracing events from the TSQL section like SQL:BatchCompleted?

    I ask since you didn't provide those details, and I have recently seen this combination of high CPU (across sixteen cores) and lots of CMEMTHREAD waits on a system where heavy use is made of scalar user-defined function calls.

    Paul, this is exactly what I was doing and the results were also exactly what you are describing. The scalar functions showing up in Activity monitor with lot of CMEMThread type waits.

    We have scheduled a upgrade to SP3 next week. Hopefully, that should fix things. Many thanks for your reply.

    Cheers!

    Anish

    If you have scalar UDFs on your system that are causing slowdowns then eliminating those will likely be the only (or at least major) thing that truly improves overall performance. I have noted before and will note again that I have spent a significant amount of my time the last few years as a SQL Server consultant helping multiple clients refactor UDFs out of their system due to horrific performance problems. IMNSHO UDFs are the single worst thing MS has ever done to 'enable' devs to destroy performance.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • I'm going to have to second the statement about UDFs. Inline query UDFs are fine, but multi-value and scalar are usually just a good way to kill server performance.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • anish_ns (8/6/2009)


    Mmm! I never had a filter on textData...only enabled it on Duration and DBID.

    Actually, we haven't upgraded our build yet.....but, I did try the trace again after running DBCC Freesystemcache. Unfortunately, it still didn't work 🙁 got the same wait bottlenecks.

    The application works fine otherwise its only the trace that causes it.

    However, within the 2 minutes that I had the trace running, luckily I caught a couple of SP's which were hogging most of the resources. I guess if I can fix those first and worry about the trace problem later.

    Cheers!

    Anish

    When you are running the trace, do you see the increase in User connections? Did you notice Disk queue length going high as well. I have seen this happen if the the hardware is not configured properly.

    -Roy

  • TheSQLGuru (8/7/2009)


    If you have scalar UDFs on your system that are causing slowdowns then eliminating those will likely be the only (or at least major) thing that truly improves overall performance. I have noted before and will note again that I have spent a significant amount of my time the last few years as a SQL Server consultant helping multiple clients refactor UDFs out of their system due to horrific performance problems. IMNSHO UDFs are the single worst thing MS has ever done to 'enable' devs to destroy performance.

    I have some sympathy for that view - it sure is possible to abuse the UDF construct, and people used to procedural programming, or perhaps looking to re-use code, can easily be led astray.

    As usual, there are bad ways and good ways to use this feature, so it is too strong to suggest that using a scalar UDF is always a bad thing. I should also say that CLR scalar UDFs can produce quite startling performance gains, when used appropriately.

    Maybe the world would be a better place if scalar UDFs were not allowed to do local data access...?

  • Maybe the world would be a better place if scalar UDFs were not allowed to do local data access...?

    I have started giving my clients that very 'mandate' actually.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • I've actually built very performant scalar UDFs. The problem with them is when you start using them in join criteria or Where clauses, and kill any possibility of index use.

    Of course, if you build them complex enough (say, for example, with multiple table variables), you can make anything suck.

    But if you build a scalar UDF that's basically a select statement, and use it for a calculated column in a select statement, it can work quite nicely. Cross Apply is better, but that isn't available in Compat 80 or on SQL 2k.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • GSquared (8/10/2009)


    I've actually built very performant scalar UDFs. The problem with them is when you start using them in join criteria or Where clauses, and kill any possibility of index use.

    Of course, if you build them complex enough (say, for example, with multiple table variables), you can make anything suck.

    But if you build a scalar UDF that's basically a select statement, and use it for a calculated column in a select statement, it can work quite nicely. Cross Apply is better, but that isn't available in Compat 80 or on SQL 2k.

    How did you determine that your scalar UDFs were "very performant"? Have you run profiler at the batch/sp complete level to see that the actual IOs are way higher than set statistics IO on returns? Or that CPU usage is actually significantly higher than SSMS actual query plan reports? Have you run profiler with statement complete on to see the data now being processed row-by-row under the covers with absolutely no notification to SSMS of that fact? Have you compared a true set-based (not triangular join, etc) solution to actual UDF performance?

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • TheSQLGuru (8/10/2009)


    GSquared (8/10/2009)


    I've actually built very performant scalar UDFs. The problem with them is when you start using them in join criteria or Where clauses, and kill any possibility of index use.

    Of course, if you build them complex enough (say, for example, with multiple table variables), you can make anything suck.

    But if you build a scalar UDF that's basically a select statement, and use it for a calculated column in a select statement, it can work quite nicely. Cross Apply is better, but that isn't available in Compat 80 or on SQL 2k.

    How did you determine that your scalar UDFs were "very performant"? Have you run profiler at the batch/sp complete level to see that the actual IOs are way higher than set statistics IO on returns? Or that CPU usage is actually significantly higher than SSMS actual query plan reports? Have you run profiler with statement complete on to see the data now being processed row-by-row under the covers with absolutely no notification to SSMS of that fact? Have you compared a true set-based (not triangular join, etc) solution to actual UDF performance?

    Yes. I routinely keep traces running on dev databases that track all T-SQL batches and commands, specifically for performance comparison. In many cases, I keep them running on production databases and review them routinely to track performance issues. I don't do this in Profiler, I use server-side tracing without a GUI to slow it all down, saving the data to text files, to minimize the impact on the server, but it's the same concept as what you asked.

    I'm not sure where you get the idea that the UDFs I'm talking about involve triangular joins or any other form of "faux set-based" processing.

    Very specifically, I've used scalar UDFs in situations where complex calculations were needed on each of the returned rows of a set of data. Various solutions were tested, and in a few cases, scalar UDFs were either the best performer, or were only outperformed by code that was so complex it would be nearly impossible to document and maintain.

    In each case, it was in the Select clause, not in any of the Join/Where math.

    For example:

    I had to take lists of locations where events were being held, and large mailing lists of invitees, and assign each invitee to the five closest locations, sometimes with more business rules regarding crossing county or state lines, sometimes not. This could have been done with a UDF to calculate the five closest, but that would have been a matter of placing one in the Where clause or Join clause, so instead I created a table that did a semi-cross-join based on maximum distance and simple differences in latitude and longitude, then performed the necessary math on all rows all at the same time. Very set-based, no UDFs, and as fast as the floating point calculations could be performed by the CPU. That was a place to avoid them, so I did.

    Then, once we had results back in for the mailings, it was necessary to generate reports on who went to which events and how distant they were. In this case, there was nothing in the Join or Where clause, it was simply necessary to take the lat and long of the attendee's address and the lat and long of the location they attended at, and calculate the distance. While it was possible to include all that math directly in the query, it also (a) was needed in a dozen or so reports, each with its own query, and (b) made the queries almost unreadable because of the complex math needed. Thus, a UDF that performed this calculation was created, tested, and found to be quite acceptable in terms of performance, reusability, and code maintenance. There, I did use it.

    Those illustrate the places to use one, and the places to avoid one.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Clearly if your logic REQUIRES complex data access/processing PER ROW of output then a UDF is quite possibly the correct tool for the job. You are definitely correct that not all logic can be represented in a set-based manner or at least done so concisely/efficiently.

    The problem is that so few people out there realize that row-by-row processing is what they will get from UDFs with ancillary data access whether they want it or not and whether or not SSMS informs them that that is what the engine is actually giving them under the covers.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • TheSQLGuru (8/10/2009)


    Clearly if your logic REQUIRES complex data access/processing PER ROW of output then a UDF is quite possibly the correct tool for the job. You are definitely correct that not all logic can be represented in a set-based manner or at least done so concisely/efficiently.

    The problem is that so few people out there realize that row-by-row processing is what they will get from UDFs with ancillary data access whether they want it or not and whether or not SSMS informs them that that is what the engine is actually giving them under the covers.

    Completely agree on that one.

    Too many people pay too little attention to what they're telling the database to do. If they cooked the way they code SQL, they'd be covered in 3rd degree burns after every pot of soup.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • TheSQLGuru (8/10/2009)


    Clearly if your logic REQUIRES complex data access/processing PER ROW of output then a UDF is quite possibly the correct tool for the job. You are definitely correct that not all logic can be represented in a set-based manner or at least done so concisely/efficiently.

    The problem is that so few people out there realize that row-by-row processing is what they will get from UDFs with ancillary data access whether they want it or not and whether or not SSMS informs them that that is what the engine is actually giving them under the covers.

    Agreed also. I would have been extremely surprised if G-squared had been talking about a scalar UDF with data access.

    A schema-bound scalar UDF which does not do data access can be quite reasonable. In general though, I dislike the fact that they eliminate parallelism from the entire plan.

    I am often moved to create a CLR scalar function instead, since these are generally faster at compute-intensive operations, and support parallelism (unless a MAX datatype appears as an input parameter).

    Paul

  • Paul White (8/10/2009)


    TheSQLGuru (8/10/2009)


    Clearly if your logic REQUIRES complex data access/processing PER ROW of output then a UDF is quite possibly the correct tool for the job. You are definitely correct that not all logic can be represented in a set-based manner or at least done so concisely/efficiently.

    The problem is that so few people out there realize that row-by-row processing is what they will get from UDFs with ancillary data access whether they want it or not and whether or not SSMS informs them that that is what the engine is actually giving them under the covers.

    Agreed also. I would have been extremely surprised if G-squared had been talking about a scalar UDF with data access.

    A schema-bound scalar UDF which does not do data access can be quite reasonable. In general though, I dislike the fact that they eliminate parallelism from the entire plan.

    I am often moved to create a CLR scalar function instead, since these are generally faster at compute-intensive operations, and support parallelism (unless a MAX datatype appears as an input parameter).

    Paul

    Well, I think (hope) I prefaced my initial stomp on UDFs with that "ancillary data access" statement as the real bugaboo. Perhaps I missed it.

    Paul, you bring up other points in that you can still get issues if a) you leave out schema binding and b) your query would perform better with parallelism. Just more ways a UDF can harm performnace.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • TheSQLGuru (8/10/2009)


    Well, I think (hope) I prefaced my initial stomp on UDFs with that "ancillary data access" statement as the real bugaboo. Perhaps I missed it.

    Paul, you bring up other points in that you can still get issues if a) you leave out schema binding and b) your query would perform better with parallelism. Just more ways a UDF can harm performnace.

    You did mention it I think, or I assumed it. Just to explain: I often reply quoting text that prompted a line of thinking - I wasn't criticising your post. Apologies if it came off that way.

Viewing 15 posts - 16 through 30 (of 30 total)

You must be logged in to reply to this topic. Login to reply