sp_monitor giving incorrect results on Multi Core Processor?

  • Hi there,

    I was just wondering if anyone else has had problems with the sp_monitor stored procedure giving incorrect results for CPU usage?

    It quite often reports that CPU busy is over 100%, but the perfmon counters indicate otherwise. In fact, the CPU never spikes over 5% according to perfmon. It's a 4x quad-core AMD Opteron 8356 2.29GHz with 16GB RAM.

    A certain vendor is blaming this stored procedure for incorrect reporting of SQL CPU usage, stating that sp_monitor has problems with multi-processor / multi-core machines.

    Is this true? Has anyone else had issues with the proc?

    Edit: SQL server version is SQL Server 2005 SP3 on W2k3 r2 SP2.

  • Has no one ever heard anything about this issue?

  • From BOL: "For each column, the statistic is printed in the form number(number)-number% or number(number). The first number refers to the number of seconds (for cpu_busy, io_busy, and idle) or the total number (for the other variables) since SQL Server was restarted. The number in parentheses refers to the number of seconds or total number since the last time sp_monitor was run. The percentage is the percentage of time since sp_monitor was last run. For example, if the report shows cpu_busy as 4250(215)-68%, the CPU has been busy 4250 seconds since SQL Server was last started up, 215 seconds since sp_monitor was last run, and 68 percent of the total time since sp_monitor was last run."

  • I'm aware of how the procedure works. I do read the BOL too before posting here.

    So how can the CPU busy seconds be more than the elapsed time? If it's been 10 seconds since the procedure was last run, how can the CPU have been busy for 15 seconds during this time?

    Has anyone experienced this, or does anyone know why it can report over 100%? And does it have anything to do with multi-processor servers?

  • Ok, well, it seems that no one here has ever seen this before! Surprising really...

    I gathered some stats, and it seems that on our 16 processor box, all the values are 16 times bigger than they should be for idle and CPU busy time compared with the perfmon counters looking at the same thing.

    I don't know if this is intended behaviour of the stored proc, but it seems you can get close to accurate results by dividing the values by the number of processors in your server.
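
    For anyone who wants to try that division, here is a rough sketch rather than a verified fix. It samples @@CPU_BUSY twice, converts ticks to seconds with @@TIMETICKS, and divides by cpu_count from sys.dm_os_sys_info. The 10-second interval and the variable names are my own assumptions, and whether dividing by the CPU count is exactly right is only what the numbers above suggest.

    ```sql
    -- Sketch only: interval length and variable names are illustrative assumptions.
    DECLARE @cpu_count INT, @busy_start INT, @busy_end INT,
            @t_start DATETIME, @t_end DATETIME,
            @busy_sec FLOAT, @elapsed_sec FLOAT;

    -- Logical CPUs visible to the operating system
    SELECT @cpu_count = cpu_count FROM sys.dm_os_sys_info;

    -- Take two samples of the cumulative busy counter, 10 seconds apart
    SELECT @busy_start = @@CPU_BUSY, @t_start = GETDATE();
    WAITFOR DELAY '00:00:10';
    SELECT @busy_end = @@CPU_BUSY, @t_end = GETDATE();

    -- @@TIMETICKS is the number of microseconds per tick
    SET @busy_sec    = (@busy_end - @busy_start) * CAST(@@TIMETICKS AS FLOAT) / 1000000.0;
    SET @elapsed_sec = DATEDIFF(ms, @t_start, @t_end) / 1000.0;

    SELECT @busy_sec    AS busy_seconds,
           @elapsed_sec AS elapsed_seconds,
           100.0 * @busy_sec / NULLIF(@elapsed_sec, 0)              AS raw_pct,     -- can exceed 100
           100.0 * @busy_sec / NULLIF(@elapsed_sec * @cpu_count, 0) AS pct_per_cpu; -- divided by CPU count
    ```

    Compare pct_per_cpu against the perfmon "% Processor Time" counter over the same window to see how close it gets on your box.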

    Hope this helps someone!

  • I know of one vendor product, Idera's Diagnostic Manager, that makes use of this legacy SP. IMHO it can provide inaccurate values, which is more evident on SMP systems. One issue is that once the @@CPU_BUSY function reaches a limit (134217727) it never increments any higher, and sp_monitor uses the @@CPU_BUSY system function in its code.

    (Review the results stored in master.spt_monitor.)

    lastrun                  cpu_busy
    2010-03-04 13:21:29.400  134217727

    Note: other @@ functions return higher numbers, so this appears to be an issue specific to that system function.
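
    If you want to check this on your own server, something along these lines should do it. Treat it as a sketch: the column aliases are made up, and it only reads what sp_monitor last stored plus the live counters.

    ```sql
    -- Values saved by the last sp_monitor run
    SELECT lastrun, cpu_busy, io_busy, idle
    FROM master.dbo.spt_monitor;

    -- Live counters; run this a few times under load.
    -- If cpu_busy_now stays at 134217727 while the others keep climbing,
    -- the counter has stopped incrementing as described above.
    SELECT @@CPU_BUSY AS cpu_busy_now,
           @@IO_BUSY  AS io_busy_now,
           @@IDLE     AS idle_now,
           GETDATE()  AS sampled_at;
    ```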

    Generally, if I'm on SQL Server 2005 or later I definitely want to be using the DMVs, as these provide more accurate results.

    Try the following example (very useful!) and compare its output against your perfmon counters.

    (It gives you metrics at one-minute intervals.)

    -- CPU utilisation history from the scheduler monitor ring buffer
    DECLARE @ts_now BIGINT

    -- Current time in ms, on the same clock as the ring buffer timestamps
    SELECT @ts_now = cpu_ticks / CONVERT(FLOAT, cpu_ticks_in_ms) FROM sys.dm_os_sys_info

    SELECT record_id,
           DATEADD(ms, -1 * (@ts_now - [timestamp]), GETDATE()) AS EventTime,
           SQLProcessUtilization,
           SystemIdle,
           -- Whatever is neither idle nor SQL Server belongs to other processes
           100 - SystemIdle - SQLProcessUtilization AS OtherProcessUtilization
    FROM (
        SELECT record.value('(./Record/@id)[1]', 'int') AS record_id,
               record.value('(./Record/SchedulerMonitorEvent/SystemHealth/SystemIdle)[1]', 'int') AS SystemIdle,
               record.value('(./Record/SchedulerMonitorEvent/SystemHealth/ProcessUtilization)[1]', 'int') AS SQLProcessUtilization,
               [timestamp]
        FROM (
            SELECT [timestamp], CONVERT(XML, record) AS record
            FROM sys.dm_os_ring_buffers
            WHERE ring_buffer_type = N'RING_BUFFER_SCHEDULER_MONITOR'
              -- Keep only records that carry a SystemHealth element
              AND record LIKE '%<SystemHealth>%'
        ) AS x
    ) AS y
    ORDER BY record_id DESC

  • Hiya

    If you check the MSDN topic for @@CPU_Busy - http://msdn.microsoft.com/en-us/library/ms186925.aspx - it says explicitly that "Result is in CPU time increments, or "ticks," and is cumulative for all CPUs, so it may exceed the actual elapsed time."

    I assume that this applies for multiple cores, and equally for sp_monitor, hence the inflated values which may exceed 100%.
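
    To put numbers on that: on a 16-CPU box, 10 seconds of wall-clock time contains up to 160 CPU-seconds, so 15 busy seconds in that window is roughly 15 / 160 = 9.4% of total capacity, not 150%. The query below is a rough sketch of the same comparison since startup. Using tempdb's create_date as the instance start time is my own approximation, and the column aliases are made up.

    ```sql
    -- Sketch only: tempdb create_date is used as an approximate instance start time.
    DECLARE @uptime_sec FLOAT, @cpu_count INT;

    SELECT @uptime_sec = DATEDIFF(ss, create_date, GETDATE())
    FROM sys.databases
    WHERE name = N'tempdb';

    SELECT @cpu_count = cpu_count FROM sys.dm_os_sys_info;

    SELECT @@CPU_BUSY * CAST(@@TIMETICKS AS FLOAT) / 1000000.0 AS cpu_busy_seconds, -- cumulative for all CPUs
           @uptime_sec              AS uptime_seconds,        -- wall-clock time
           @uptime_sec * @cpu_count AS cpu_seconds_available; -- upper bound across all CPUs
    ```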

  • Hi.

    I'm also interested in getting the real CPU utilisation when multiple cores are involved.

    I would divide the reported % by the number of CPUs that SQL Server may utilize.
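
    A sketch of how that count could be obtained, assuming that schedulers with status 'VISIBLE ONLINE' in sys.dm_os_schedulers are a fair stand-in for the CPUs SQL Server can use (this matters when an affinity mask hides some CPUs). The @reported_pct value is a made-up placeholder for whatever sp_monitor printed.

    ```sql
    -- Sketch only: @reported_pct is a hypothetical figure taken from sp_monitor output.
    DECLARE @reported_pct FLOAT, @usable_cpus INT;
    SET @reported_pct = 120.0;

    -- Schedulers that can run user work; hidden and offline schedulers are excluded
    SELECT @usable_cpus = COUNT(*)
    FROM sys.dm_os_schedulers
    WHERE status = N'VISIBLE ONLINE';

    SELECT @usable_cpus AS usable_cpus,
           @reported_pct / NULLIF(@usable_cpus, 0) AS pct_per_usable_cpu;
    ```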
