shared & Exculsive latches?

  • Hi,

    PAGEIOLATCH_SH

    ---------------

    wiat_time_S 175094.26

    Pct 1.05

    Running_Pct 96.99

    PAGEIOLATCH_EX

    --------------

    wiat_time_S 119320.6

    Pct 0.71

    Running_Pct 98.53

    these shared & Exculsive latchs acquired in waits stats in cummulative number since restarted server,

    due to data reading DISK IO subsystem instead of reading data from buffer pool.

    Pl. suggestion me, what could be reason data is not read to buffer pool?

    both latches are reaching avg 10 ms.

    Need to add more memory?

    thanks

    ananda

  • Data is always read to the buffer pool. PageIOLatches happen when data is needed and is not in the buffer pool, hence has to be read from disk into the buffer pool.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • ok.. could you confirm the above wait_stats_s values normal or not? on OLTP database.

  • No idea. You're the only one who can say if they're normal. Are they close to wait times that your benchmarks for that server recorded?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Hi - As Gail stated, it is very difficult to say which is the culprit, without completely knowing your system.

    Few things that can be checked - total physical memory, RAID details, data file and log files location, tempdb location, number of cpu, max memory setting, Virtual memory size, max degree of parallelism etc,.

    I'll share my recent experience on working with similar kind of issue, Try it - if it is helpful.

    There are two instances on a server, one is OLTP and the another is OLAP. Total RAM size is 64 GB, RAID 5+ 1 subsystem is used. 16 core processor. PAGEIO_LATCH were huge.

    I checked the RAM usage, and enabled various counters for monitoring Disk usage, Memory usage etc. Observed a huge paging activity - then I found Virtual memory was configured 4 GB only (Gail only showed me some light on the dark area - Thanks again to Gail). It was increased to 110 GB. Then I collected the user queries through trace for a span of time - they were drilled down, checked its execution plan and found some issues like missing indexes, some queries were not properly picking the indexes/ right indexes. Everything was fixed, took almost 10 days. Then statistics was updated, then it started to work like a charm. Now PAGEIO_LATCH count is very very less.

    Most of the time index would be the culprit hence check it. Try to create a base for your server. From that try to figure out which goes wrong.

  • balasaukri (7/22/2013)


    then I found Virtual memory was configured 4 GB only. It was increased to 110 GB.

    If you mean the page file, then 4GB is probably fine and 110 probably wastefully high. SQL doesn't use the page file (and if windows forces SQL's memory into the page file, you have a much larger problem).

    On a dedicated SQL server, the page file only needs to be large enough for a crash dump, and that's a mini dump or kernel dump, not a full memory one.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Yes - I mean Virtual memory only! On a thinking that Microsoft recommends to keep the page file from 1.5 to 2 times of physical memory(in my case it is 64 GB), I just changed its size to 110 GB. I am unaware about the issue that you have stated now. I will revert it to 4 GB again and will monitor if any issue happens. Thank you.

  • Hi balasaukri,

    How many array controller have in your physical box? for the tempdb location, what is your RAID configuration? is it RAID 5 +1? it means RAID 5(striping with mirroring) and RAID 1(mirroring)... your tempdb located at which RAID?

    rgds

    ananda

  • I don't know how many array controllers! Regarding RAID - you are correct!! All the DBs are (including tempdb) in RAID 5+1 only.

  • 5+1? That's a very unusual RAID type. Don't think it would be that good for TempDB, 10 would be better, 5+1 would still have the write overhead from the parity stripe, just like plain RAID 5.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (7/24/2013)


    5+1? That's a very unusual RAID type. Don't think it would be that good for TempDB, 10 would be better, 5+1 would still have the write overhead from the parity stripe, just like plain RAID 5.

    I am not quite familiar with RAID concepts! And it is Clients environment, I can't do much on that. DEV server is on 5+1 and the PROD server is on RAID 10. I faced the performance problem in DEV server only. But after did the work around, as I mentioned in this chain, no query runs more than 30 secs. Performance is really very fine.

    Please suggest me some good reference to start with RAID. Thanks

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply