Performance issue on DB

  • Hi,

    We have a sql 2005 prod server,where one of the DB became very slow past 5 days.

    i have observer that blocking happening @ regular intervials from past 3 days.

    and also i have checked fragmentation and some the non cluster indexs and heap ,avg_fragmentation_in_percent

    is having more than 40% .

    Can you please help us hw to troubleshoot this issue.

    Many THanks.

  • The source of the problem could be on the SQL server side ( bad queries, stale statistics or fragmented indexes, unexpected data growth, etc.)

    or on the OS/hardware side (disk/IO issues, power saving features were turned on on your server, etc.)

    Do you do regular index and statistics maintenance?

    During periods of slowness and blocking is CPU maxed out or do you have disk queue build up or both?

    Are you seeing unusual growth in databases (user or tempdb, data or log files)?

  • Performance issues can be caused by so many factors from hardware issues, inefficient code, out of date statistics, bad execution plans the list could go on and on.

    You need to start with the basics:

    1. Using performance monitor create a counter log to capture key hardware metrics, such as CPU utlisation, reads, writes, processor queue length, disk queue lengths etc.

    For example:

    Memory Counters

    i.Memory Available Mbytes, at least 200MB should be free.

    ii.Memory Pages/sec below 20 preferably below 15

    iii.SQL Server: Buffer Manager: Buffer Cache Hit ratio should exceeed 90% preferably 99%

    iv.SQL Server: Buffer Manager: Page Life Expectancy, should exceed 300, the higher the value the better.

    Processor Counters

    i.Processor: %Processor Time: Should Average Below 75% preferably below 50%

    ii.System: Processor Queue Length: Should average below 2 per processor, For example is a 2 processor machine it should remain below 4.

    I\O Bottlenecks

    i.Physical Disk - % Disk Time should average below 50%

    ii.Physical Disk – Avg Disk Queue Length. Should average below 2 per disk. For example for an array of 5 disks this figure should be below 10.

    iii.Physical Disk – Avg. Disk Reads/sec. Should be below 85% of the IOPS value of the disk

    iv.Physical Disk – Avg. Disk Writes/sec. Should be below 85% of the IOPS value of the disk

    2. Create a server side trace to capture the calls made to the database and write them to a file, as I have no way of knowing how you connect to or use your database I would start with the events RPC_completed.

    Once you have your trace you can use the code simliar to below to find the top consuming queries by reads, or number of executions

    -- variables

    declare @trace nvarchar(1000), @filter nvarchar(1000)

    select @trace ='E:\Traces\output_20120125_1826\sp_trace.trc'

    -- read raw data

    select *

    into #trace

    from ::fn_trace_gettable(@trace, DEFAULT)

    where TextData is not null

    and CONVERT(varchar(100), textdata) <> 'exec sp_reset_connection'

    -- extract according to filter

    select CASE WHEN TextData LIKE '%exec %' THEN CONVERT(varchar(1000), TextData) ELSE NULL END AS ProcName, HostName, cpu, reads, starttime, CONVERT(varchar(1000), TextData) AS TextData, Duration

    into #procs

    from #trace

    where textdata like case when @filter is not null then '%' + @filter + '%' else textdata end

    -- Deletes

    DELETE #procs WHERE textData LIKE '%msdb.dbo.%'

    DELETE #procs WHERE TextData = ' if @@trancount > 0 commit tran'

    DELETE #procs WHERE TextData = 'begin tran'

    -- reformat

    update #procs

    SET procname = SUBSTRING(procname, CHARINDEX('exec dbo.', ProcName), 1000)

    WHERE CHARINDEX('exec dbo.', ProcName) > 0

    update #procs

    set procname = REPLACE(ProcName, 'exec dbo.', '')

    update #procs

    set procname = LEFT(procname, charindex('@', ProcName)-1)

    where CHARINDEX('@', procname) > 0

    -- summary results by proc

    select top 10 procname, COUNT(*) as calls, AVG(cpu) AS AvgCPU, AVG(Reads) AS AvgReads, MIN(starttime) AS [Start], MAX(starttime) AS [End], AVG(Duration) AS AvgDuration

    from #procs

    group by procname

    order by (COUNT(*) * AVG(Reads)) DESC

    -- results for top 5 sprocs

    select *

    from #procs

    where procname in (

    select top 1 procname

    from #procs

    group by procname

    order by (COUNT(*) * AVG(Reads)) DESC

    )

    order by StartTime

    -- cleanup

    DROP TABLE #trace

    DROP TABLE #procs

    MCITP SQL 2005, MCSA SQL 2012

  • Thanks for reply,

    Blocked queries that are hitting the table ,contains non cluster indexes which is have fragmentation(avg freg colum >50%)....plz clarify whether rebuild is best option for troubleshooting the issue.

    advance thanks.

  • Best option, no. If management is screaming that 'something must be done', then you can rebuild but it probably won't have a noticable effect (unless the problem was due to stale statistics)

    http://www.simple-talk.com/sql/performance/finding-the-causes-of-poor-performance-in-sql-server,-part-1/

    http://www.simple-talk.com/sql/performance/finding-the-causes-of-poor-performance-in-sql-server,-part-2/

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • RTaylor2208 (7/15/2013)


    iii.SQL Server: Buffer Manager: Buffer Cache Hit ratio should exceeed 90% preferably 99%

    Buffer cache hit ratio is a near-useless counter. By the time it shows a noticable drop, the server would have been under cripple memory pressure for some time

    iv.SQL Server: Buffer Manager: Page Life Expectancy, should exceed 300, the higher the value the better.

    Oh not that 300 nonsense again. 300 sec for an average page to stay in cache was borderline too low when the figure was released 8 odd years ago, at a time when 4GB of memory was a lot for a server.

    These days when 64GB of memory is small and servers with over 1TB of memory are known, suggesting that it's acceptable to have the entire buffer pool displaced and the entire size of the buffer pool read off disk every 5 minutes is just wrong, there are very, very few IO subsystems that could handle 64GB being read every 5 minutes.

    Buffer cache hit ratio should be within normal range of the server in question, as identified by benchmarking. If a threshold is wanted, then a bare minimum could be (300/4)*(max server memory in GB)

    i.Processor: %Processor Time: Should Average Below 75% preferably below 50%

    Why would I want half my processors to be doing nothing? If I have a server where the average CPU is under 50, that's a candidate for consolidation so that I can get higher hardware utilisation.

    ii.Physical Disk – Avg Disk Queue Length. Should average below 2 per disk. For example for an array of 5 disks this figure should be below 10.

    Not true for a number of reasons. Firstly, unless you have direct attached storage, there's too much between the machine and the disks to come up with any reasonable threshold. The second reason is that SQL is designed to intentionally queue up lots of IOs, it can and will intentionally, as part of normal operation, drive the disk queue length very high.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 6 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply