Dataware Performance Degradation - Urgent Situation.

  • Thanks in advance !

    We are having a situation (PROBLEM) with our DW - SQL SERVER 2000 SP4 ENT X86, it suddenly slowed a lot.

    CPU Usage : 20%

    IO : AVR DISK QUEUE LENGHT Between 1.5 and 2 (rarelly spikes (20/40)).

    MEM : 8 GB, 6GB to SQL SERVER.

    All the code that is running against the DB is fined tuned ( Most tables are Index Seeked with Clustered Indexes ), biggest table 450 GB.

    No Lock Waits ( Except the main process(spid) that is showed has locked by it's own spid ( latch_xx (sh,ix)).

    I guess that the problem is related to storage, because it is only doing 1-3MB sec (sometimes 20MB/sec).

    PerfMon ( PROCESS SQLSERVER ( DISK Read Bytes ( between 1-3MB sometimes 20 MB/s), Disk Bytes Write Bytes/Sec same as Reads).

    No more sql server instances or other software hogging the server.

    I can't understand why the SQL SERVER is never using the IO it is used to use (25MB/S).

    I don't see any DISK QUEUE on the Partitions that are used to store the datafiles/logfiles)

    No Pages splits in the SQL SERVER ACCESS METHODS.

    Sometimes I get about 1 GB in dirty buffers.

    98 % Cache Hit Ratio.

    ALL points to IO contention, but the counters from windows don't indicate this.

    All help is REALLY appreciated.

    Best Regards.

  • Netic (4/2/2008)


    Thanks in advance !

    We are having a situation (PROBLEM) with our DW - SQL SERVER 2000 SP4 ENT X86, it suddenly slowed a lot.

    CPU Usage : 20%

    IO : AVR DISK QUEUE LENGHT Between 1.5 and 2 (rarelly spikes (20/40)).

    MEM : 8 GB, 6GB to SQL SERVER.

    All the code that is running against the DB is fined tuned ( Most tables are Index Seeked with Clustered Indexes ), biggest table 450 GB.

    No Lock Waits ( Except the main process(spid) that is showed has locked by it's own spid ( latch_xx (sh,ix)).

    I guess that the problem is related to storage, because it is only doing 1-3MB sec (sometimes 20MB/sec).

    PerfMon ( PROCESS SQLSERVER ( DISK Read Bytes ( between 1-3MB sometimes 20 MB/s), Disk Bytes Write Bytes/Sec same as Reads).

    No more sql server instances or other software hogging the server.

    I can't understand why the SQL SERVER is never using the IO it is used to use (25MB/S).

    I don't see any DISK QUEUE on the Partitions that are used to store the datafiles/logfiles)

    No Pages splits in the SQL SERVER ACCESS METHODS.

    Sometimes I get about 1 GB in dirty buffers.

    98 % Cache Hit Ratio.

    ALL points to IO contention, but the counters from windows don't indicate this.

    All help is REALLY appreciated.

    Best Regards.

    Hi,

    How full are your drives supporting database files? Has anything changed recently? Have you monitored locking? What does memory pressure look like? What does storage system look like (e.g. DAS, Local,NAS or SAN)? Any Anti-virus software installed on server? If so, have SQL Server files been excluded?

    Thanks,

    Phillip Cox

  • It's an EMC SAN.

    Thanks !

  • Any changes to HBA card(s) on server (e.g. driver update)?

    What is slow performance based on (e.g. reports taking longer)? How often do you update warehouse?

    Thanks,

    Phillip

  • Daily,

    Same data volume, but taking 4x the normal time.

    I don't have console access to the server...

    Complicated situation to expose here...

    Need more help please 😐

    Thanks !

  • Netic (4/2/2008)


    Daily,

    Same data volume, but taking 4x the normal time.

    I don't have console access to the server...

    Complicated situation to expose here...

    Need more help please 😐

    Thanks !

    What is taking 4x longer? Does the SQL Server service have "lock pages in memory" enabled?

  • The Inserts..

  • I can't understand why the server is using just 20-30% of Disk time...

  • Ok.

    Do you drop indexes prior to loading? What recovery mode are you using for target database? Have you checked transaction log usage and disk space where log resides?

    Thanks,

    Phillip Cox

  • Have your hardware folks taken a look?

    Did they start sharing any of the SQL LUN's with other applications? How's the IO looking on the SAN?

    We had something like that happen when someone added Exchange and SQl Server to the same arbitrated loop SAN. Let's just say their data access patterns didn't do so well with each other.

    Shared LUN's suck for this kind of reason.

    Also - are both HBA cards running? Are both channels up?

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • It is odd that inserts are taking that long but your not seeing any Perfmon counters indicating you are disk bound. I'd start using Sql Profiler and focus on high write processes. From there try to map the code to the tables and where they are at on disk. Then work with your storage team\vendor to determine if something in wrong at the hardware level. That seems like the most likely cause.

    Once you've identified the high write processes start profiling for those specifically and go to the statement level. If it is the code and not the hardware this will find it.

    Last thought, review your indexes and triggers to see if new ones have been added. You could be overindexed (e.g an index on every single column in the table) or triggers could be stretching out your transaction length. Tracing stored procs to the statment level will show you trigger execution.

    If you need help with the tracing, PM me and I'll try and lend a hand.

    David

  • Two more thoughts, check to is if your autogrowing in the database and\tempdb. You could be at max file size and processes are waiting on growth.

  • Nothing new on the Arquitecture.

  • Just 1 HBA and it's up.

  • Just 1 HBA.

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply