Page life expectancy diving to sub-10 on a 128 GB server
Posted Thursday, March 7, 2013 10:35 PM


SQLRNNR (3/7/2013)
Hardware reserved, is that in MB or GB that you are showing?

NUMA - did you make any changes to CPU affinity or is the default setting still active (SQL Server managing)?

HowardW asked a question concerning Buffer Node v Buffer Manager in regards to which PLE you are monitoring. Which one is reporting the erratic stats?



Sorry, hardware reserved is 64 MB.

NUMA is default.

Perfmon's per-node breakdown shows PLE on node "000" as 38 and node "003" as 43 right now (i.e., both seem equally low).

Gail:

Index rebuilds are set for Sunday nights only (and don't appear in sp_whoisactive when I see PLE plummet). CheckDB is not run against this system, I run it on the DR site.


allmhuran.com - download the SSMSDeploy addin for SSMS 2008
Blog on sqlservercentral
Post #1428389
Posted Thursday, March 7, 2013 10:37 PM


What are the maximum values you have seen for PLE on each node?



Jason AKA CirqueDeSQLeil
I have given a name to my pain...
MCM SQL Server


SQL RNNR

Posting Performance Based Questions - Gail Shaw
Posting Data Etiquette - Jeff Moden
Hidden RBAR - Jeff Moden
VLFs and the Tran Log - Kimberly Tripp
Post #1428391
Posted Thursday, March 7, 2013 10:40 PM


I haven't been tracking PLE by node before now, but the max I've seen for the buffer manager counter is plenty high: 50,000.

If I graph this counter over the last week, I see a dozen and a half or so precipitous drops from very high values (20,000+) down to virtually zero. There's no obvious pattern to the timing, even if I round to the nearest hour. Sometimes it's at around 4 am (the start of the ETLs), but it can also occur at seemingly any time, day or night.
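
For anyone wanting to reproduce this kind of check on their own counter log, here is a minimal Python sketch of how such drops could be flagged. It is not what was actually run here: the sample values are made up for illustration, and the `high` and `low` thresholds and the `find_ple_drops` name are assumptions.

```python
from datetime import datetime

def find_ple_drops(samples, high=20_000, low=100):
    """Return timestamps where PLE fell from >= high to <= low
    between two consecutive samples. Thresholds are illustrative guesses."""
    drops = []
    for (_, prev_ple), (ts, ple) in zip(samples, samples[1:]):
        if prev_ple >= high and ple <= low:
            drops.append(ts)
    return drops

# Made-up hourly samples of (timestamp, PLE in seconds), not real data.
samples = [
    (datetime(2013, 3, 7, 3, 0), 48_000),
    (datetime(2013, 3, 7, 4, 0), 12),
    (datetime(2013, 3, 7, 5, 0), 3_600),
    (datetime(2013, 3, 7, 6, 0), 25_000),
    (datetime(2013, 3, 7, 7, 0), 40),
]

drops = find_ple_drops(samples)
# Hours of day at which a drop was seen, to eyeball any timing pattern.
drop_hours = sorted({ts.hour for ts in drops})
print(drop_hours)
```

Grouping the drop timestamps by hour (or by whatever else was running at the time) is one way to look for the timing pattern described above.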


allmhuran.com - download the SSMSDeploy addin for SSMS 2008
Post #1428394
Posted Thursday, March 7, 2013 10:48 PM


Is SQL Server the only thing on this box or do you have other apps and things like AntiVirus installed on this box too?



Jason AKA CirqueDeSQLeil
Post #1428395
Posted Thursday, March 7, 2013 10:52 PM


All SQL baby!

The next highest memory consumer (right now, per Resource Monitor) is the DTS service, with a commit charge of 353 MB (this is a two-node clustered instance on a four-node Windows cluster).

Integration Services 2008 is (unfortunately) installed in order to run half a dozen legacy packages that weren't migrated prior to the big server move, but the packages are tiny, execute in seconds or minutes, and only run once a day at around 2 am.

Cheers for the continued attention, BTW. I hope you're as confused as I am.


allmhuran.com - download the SSMSDeploy addin for SSMS 2008
Post #1428396
Posted Thursday, March 7, 2013 11:07 PM


Ok, does the same thing happen on each node of the cluster? Just trying to get a wider picture.



Jason AKA CirqueDeSQLeil
Post #1428399
Posted Friday, March 8, 2013 2:59 AM


sys.dm_os_wait_stats
CXPACKET is at the top (15%), followed by...
SP_SERVER_DIAGNOSTICS_SLEEP
XE_TIMER_EVENT
HADR_FILESTREAM_IOMGR_IOCOMPLETION
DIRTY_PAGE_POLL
BROKER_EVENTHANDLER
XE_DISPATCHER_WAIT
LOGMGR_QUEUE


Most of those are meaningless waits, background processes that are supposed to be waiting. That list says nothing of any value.

CheckDB is not run against this system, I run it on the DR site.


What is your DR technology? How does the data get to DR?

There are very few DR methods that allow you to run CheckDB against your DR server and effectively check your production DB.

Virtual machine? If so, make sure that the memory's not overcommitted.

Any entries in the error log about cache flushes?



Gail Shaw
Microsoft Certified Master: SQL Server 2008, MVP
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Post #1428471
Posted Friday, March 8, 2013 5:13 AM


GilaMonster (3/8/2013)
sys.dm_os_wait_stats
Most of those are meaningless waits, background processes that are supposed to be waiting. That list says nothing of any value.


Yep, that's kinda the point I'm making by noting that iolatch is down at 12th under all of this. :)

The CXPACKET waits are also pretty standard for the environment, as in they also existed on the old server. They're a product of less-than-ideal code in the ERP database.


What is your DR technology? How does the data get to DR?
Any entries in the error log about cache flushes?


The DR site is maintained by restores from production SQL Server backups. No VM or SAN magic at play. I set CheckDB to run there last week after seeing this pattern of activity. Normally it would run locally on Sunday night, after reindexing and before the weekly full backup (diffs daily at 9 pm, log backups hourly). I'm using a wrapped-up version of Ola Hallengren's stored-procedure-based maintenance solution.

Ok, does the same thing happen on each node of the cluster? Just trying to get a wider picture.


There's a second cluster, which fails over in the other direction and runs on the secondary node (it's the "live training" system, Enterprise licensed). It gets much less activity than the production ERP. That instance is holding a steady PLE of over 60,000.


allmhuran.com - download the SSMSDeploy addin for SSMS 2008
Post #1428530
Posted Friday, March 8, 2013 5:20 AM


allmhuran (3/8/2013)
GilaMonster (3/8/2013)
sys.dm_os_wait_stats
Most of those are meaningless waits, background processes that are supposed to be waiting. That list says nothing of any value.


Yep, that's kinda the point I'm making by noting that iolatch is down at 12th under all of this. :)


You're missing my point. Those waits are ones you ignore completely. Saying that IOLatch waits are 12th under that list means they're the second-highest useful wait in the system. Those background waits are normal and expected to be high; they come from processes that spend most of their time doing nothing, so you just filter them out of any wait analysis.
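
As a rough illustration of that filtering step, here is a small Python sketch that drops the background waits before ranking. The wait type names are the ones posted earlier in the thread, treating all seven non-CXPACKET entries as ignorable is an assumption, and the millisecond figures and the `PAGEIOLATCH_SH` row (standing in for "iolatch") are invented for the example.

```python
# Background waits from the list posted earlier in the thread.
# Assumption: every one of these is safe to ignore in wait analysis.
BENIGN_WAITS = {
    "SP_SERVER_DIAGNOSTICS_SLEEP",
    "XE_TIMER_EVENT",
    "HADR_FILESTREAM_IOMGR_IOCOMPLETION",
    "DIRTY_PAGE_POLL",
    "BROKER_EVENTHANDLER",
    "XE_DISPATCHER_WAIT",
    "LOGMGR_QUEUE",
}

def useful_waits(waits, ignore=BENIGN_WAITS):
    """Drop ignorable wait types, then rank the rest by wait time, highest first."""
    kept = [(name, ms) for name, ms in waits if name not in ignore]
    return sorted(kept, key=lambda w: w[1], reverse=True)

# Hypothetical (wait_type, wait_time_ms) rows; the numbers are made up.
waits = [
    ("CXPACKET", 1500),
    ("SP_SERVER_DIAGNOSTICS_SLEEP", 1400),
    ("XE_TIMER_EVENT", 1300),
    ("HADR_FILESTREAM_IOMGR_IOCOMPLETION", 1200),
    ("DIRTY_PAGE_POLL", 1100),
    ("BROKER_EVENTHANDLER", 1000),
    ("XE_DISPATCHER_WAIT", 900),
    ("LOGMGR_QUEUE", 800),
    ("PAGEIOLATCH_SH", 300),
]

ranked = useful_waits(waits)
for name, ms in ranked:
    print(name, ms)
```

With the background waits removed, a wait that looked buried far down the raw list ends up right behind CXPACKET.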



Gail Shaw

Post #1428531
Posted Friday, March 8, 2013 5:25 AM


GilaMonster (3/8/2013)
allmhuran (3/8/2013)
GilaMonster (3/8/2013)
sys.dm_os_wait_stats
Most of those are meaningless waits, background processes that are supposed to be waiting. That list says nothing of any value.


Yep, that's kinda the point I'm making by noting that iolatch is down at 12th under all of this. :)


You're missing my point. Those waits are ones you ignore completely. Saying that IOLatch waits are 12th under that list means they're the second top useful wait in the system. Those background waits are normal and expected to be high, they're from processes that spend most of their time doing nothing, so you just filter them out of any wait analysis.


No, I get your point, I'm just making it explicit that there's nothing particularly unusual at the top end of the list.

Regarding log entries: yes, there are lots of cache flush messages. I understand the checkpoint process now logs these by default, without needing a trace flag.
- "FlushCache: cleaned up 2037 bufs with 1673 writes in 86339 ms (avoided 1002 new dirty bufs) for db 7:0"
- "average throughput: 0.18 MB/sec, I/O saturation: 3196, context switches 7021"
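
As a sanity check on those two lines, here is a short Python sketch that parses the first message and recomputes the throughput. It assumes each "buf" is one 8 KB page and that throughput is simply pages cleaned over elapsed time; that this is how SQL Server derives the logged figure is an assumption, though the result does come out at the same 0.18 MB/sec. The regex and function name are made up for the example.

```python
import re

PAGE_BYTES = 8 * 1024  # assumption: one "buf" is one 8 KB data page

# Matches the numeric fields of the FlushCache message quoted above.
FLUSH_RE = re.compile(r"cleaned up (\d+) bufs with (\d+) writes in (\d+) ms")

def flush_throughput_mb_per_sec(message):
    """Return MB/sec implied by the buffer count and elapsed time in the message."""
    match = FLUSH_RE.search(message)
    if match is None:
        raise ValueError("not a FlushCache message")
    bufs, _writes, elapsed_ms = (int(g) for g in match.groups())
    megabytes = bufs * PAGE_BYTES / (1024 * 1024)
    return megabytes / (elapsed_ms / 1000)

msg = ("FlushCache: cleaned up 2037 bufs with 1673 writes in 86339 ms "
       "(avoided 1002 new dirty bufs) for db 7:0")
rate = flush_throughput_mb_per_sec(msg)
print(f"{rate:.2f} MB/sec")  # prints "0.18 MB/sec"
```

In other words, roughly 16 MB of dirty pages took about 86 seconds to write out.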

But I'm curious where you might be going... are you thinking a recovery interval issue, or perhaps this?:
http://blogs.msdn.com/b/joaol/archive/2008/11/20/sql-server-checkpoint-problems.aspx

Edit:

I might as well note here that I'm also getting appdomain unloads and messages indicating long IO requests, but I figured that was already a given based on numbers already posted.


allmhuran.com - download the SSMSDeploy addin for SSMS 2008
Post #1428532