Server Performance

  • Hi,

    Attached are performance stats for server 1. Server1 is connected to a SAN drive. Performance over the last year or so has been good. 3 weeks ago user machines started hanging. I noticed that the logs and the data files for the databases attached were on the same SAN drive, so i moved the log files to a different drive(local to the server) and kept the data on the SAN. I generated the stats attached. What could be the probem? Is it the SAN drive which is the problem? Do more disks need to be added to the raid set? Am i reading this right?

    Please help.

  • What does hanging means here, is it the whole application or the part of the application, and if yes, then what part of the day...

    if it is at specific time, check if any background processes are running at the same time, monitor the CPU usage.

  • You can also check sp_who2 and see if there is any blocking happening!

  • Hi

    What RAID are u using, what's the SQL Configuration, what's the performance issue you are facing, since then some one can suggest that the counters that you have provided is fine or we need to look into some more counters.

    More over does the counters remains same most of the times???

    Ouch so many Questions ( ha ha ha)

    Cheers

    🙂

  • What do the following two counters look like?

    Physical disk: Avg sec/read

    Physical disk: Avg sec/write

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • The Avg. disk sec/read is between 0.005 to 0.020. The Avg disk sec/write is between 0.000 and 0.008. When i meant hanging, i was talking of the call centre application that people use when working with the databases on the server. I used sp_lock when the problem started and there were no locks then. There are no locks now. No background processes seem to be running or massive jobs. The application hangs at intermittent times during the day but i know for a fact that from 3.45pm in the afternoon to around 5pm, this problem occurs as some jobs that run during that time are slowed down.

  • Those look good.

    Can you monitor the stats as well as watching for blocking during the known slow time?

    Perhaps also run profiler for a while and see if you can correlate random slowdowns to certain queries running against the server.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • I have attached a trace for you to help me with. I did this trace on the 16th of september this month when the problem was happening. I only put the maximum number of rows in the Excel sheet.

    To elminate the SAN drive as the problem, I took one database and moved it to 2 local drives on the server(one for the data and the other for the log). Users worked ok for the day or so. The jobs that ran at 4 ran quick. A couple of minutes later, perormance started to go haywire on both the SAN(where the majority of the databases are) and on the data drive of the database that i had moved. I could not identify any blockages.

  • No flame wars please, but in my experience, as a rule of thumb, never never use SAN for databases.

    NAS on the other hand is great!

    The network stack is not the right place to be putting that kind of I/O.

    I have no doubt there are some wonderfully clever people out there who can make SANs work for databases, but my experience over the last 10 years with this kind of kit has been that NAS > Direct attached disk > SAN

  • Hi,

    Just to add to the problem i raised and which still continues, machines started hanging this morning. I took a caption of the sp_who2 results. The results are attched. Ignore those spids that are blocking others because for me thats not where, in my opinion, the problem is. Thats not were my issue is. I noticed that there yellow highlighted row is consuming a lot of disk time and cpu time. What is checkpointing? Is the process being done on the disk and how can it be avoided in any way while people are workin?

    Thanks

    Tendayi

  • Hi

    This sounds Crazy, is that an in House application, if yes was there any changes made in the past week, since i have experienced the same problem, no offence, but mostly developers don't take database load into consideration, is that a multi thread application??

    Cheers

    🙂

  • I would put the profiler on around the time and write this to a table and see what is happening at the exact time.

    Is the SQL going down or just connections not reaching the database.

    There are some black box logs you can set up too to see what is happening too.

    --Trace Files

    SELECT * FROM SYS.TRACES

    --Set up the BLACK BOX

    --Run this first to create the BlackBox Trace

    DECLARE @TraceId int

    DECLARE @maxfilesize bigint

    SET @maxfilesize = 25

    EXEC sp_trace_create

    @TraceId OUTPUT,

    @options = 8,

    @tracefile = NULL,

    @maxfilesize = @maxfilesize

    EXEC sp_trace_setstatus @TraceId, 1

    --Run this to create the Stored Procedure

    USE master

    GO

    CREATE PROCEDURE StartBlackBoxTrace

    AS

    BEGIN

    DECLARE @TraceId int

    DECLARE @maxfilesize bigint

    SET @maxfilesize = 25

    EXEC sp_trace_create

    @TraceId OUTPUT,

    @options = 8,

    @tracefile = NULL,

    @maxfilesize = @maxfilesize

    EXEC sp_trace_setstatus @TraceId, 1

    END

    GO

    --Run ths command to start on server startup

    EXEC sp_procoption

    'StartBlackBoxTrace', 'STARTUP', 'ON'

    --Use this to get the details of the black box

    SELECT * FROM fn_trace_gettable('C:\Program Files\Microsoft SQL Server\MSSQL.7\MSSQL\DATA\blackbox.trc', default);

    GO

    Change above to the path you set this up on.

    This can help to some degree to see what going on.

    Is there anything in logs .

  • There some large durations here

    exec PaymentFrequencyGetAll

    and forsa2000000

    exec [c3sabc]..sp_procedure_params_rowset N'ContactNumberUpdate',1,NULL,NULL

    exec [c3vvm]..sp_procedure_params_rowset N'DebtorDetailGet',1,NULL,NULLC3C3User24443056

    This seems like it taking forever to get the data from the SAN to me..............

    Has anyone ran a fragmentation on the SAN disks at all to see how badly it is fragmented.

    Why are these durations so high. I guess that where i would start.

  • Checkpointing is a process where all the "dirty pages" (changed data) are written to the disk. It is described fairly well in the SQL help.

    The main cause of issues around this seem to be Long Running Transactions. When they finally commit they can cause issues with checkpoints. It may be worth reviewing any transactions that are active for more than a few minutes.

  • Hi,

    I ran the procedure which you sent. It showed me a number of queries that had been run, their start time but with no end time. I also picked up entries where the severity level was 15 and 16. If the entries don't have an end time, what does that mean?

    The server with the problem was last defragged 2 weeks ago so that rules out that option.

    Thanks

    Tendayi

Viewing 15 posts - 1 through 15 (of 15 total)

You must be logged in to reply to this topic. Login to reply