SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Page life expectancy diving to sub-10 on a 128 GB server


Page life expectancy diving to sub-10 on a 128 GB server

Author
Message
Don Halloran
Don Halloran
Ten Centuries
Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)

Group: General Forum Members
Points: 1231 Visits: 1490
Last weekend we migrated our primary ERP system from a Windows 2003, SQL 2008R2 server to a windows 2008R2, SQL 2012 server.

The new servers are much beefier than the old ones, including an extra 32 gig of ram.

The ERP database is ~500 gig, of which a good amount is audit data. On the old server I generally had page life expectancy values around 1000 or more for normal daily activity.

On the new server I have been capturing perfmon stats for the past week and I see the PLE jumping all over the place. It climbs up into the several hundred but then suiddenly dives to ridiculously low values like 12 or even 7.

The server usage profile has not changed, it's still the same users and applications doing the same things they were doing last week.

The instance is set to have min 70 gig, max 110 gig allocated to SQL.
The service account has lock pages in memory permission.

Can anyone think of some setting I might have missed during migraiton, or some new option to be set in SQL2012 that I've overlooked, that could explain this?

Blog on sqlservercentral
Lynn Pettis
Lynn Pettis
SSC Guru
SSC Guru (89K reputation)SSC Guru (89K reputation)SSC Guru (89K reputation)SSC Guru (89K reputation)SSC Guru (89K reputation)SSC Guru (89K reputation)SSC Guru (89K reputation)SSC Guru (89K reputation)

Group: General Forum Members
Points: 89686 Visits: 38933
Well, if we could see what was running on the server when PLE drops we might be able to help you out, unfortunately we can't see it.

One guess would be to ask if you are running CHECKDB on the database at the time PLE drops. Would also help to know if there are any other maintenance routines that might be running then as well.

Cool
Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
Don Halloran
Don Halloran
Ten Centuries
Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)

Group: General Forum Members
Points: 1231 Visits: 1490
Yeh, right now I'm running a server side trace and a few more measures to see if I can match up activity to counters. As I said, though, there's no new usage pattern on the server, no new apps deployed, just all the same jobs, users and apps that were there before. A sudden usage pattern change doesn't seem as likely as me possibly missing some new setting or other.

Blog on sqlservercentral
GilaMonster
GilaMonster
SSC Guru
SSC Guru (210K reputation)SSC Guru (210K reputation)SSC Guru (210K reputation)SSC Guru (210K reputation)SSC Guru (210K reputation)SSC Guru (210K reputation)SSC Guru (210K reputation)SSC Guru (210K reputation)

Group: General Forum Members
Points: 210746 Visits: 46251
No server options that could cause this. Probably CheckDB or index rebuilds that move a huge amount of data through the cache.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass


TheSQLGuru
TheSQLGuru
SSC-Dedicated
SSC-Dedicated (30K reputation)SSC-Dedicated (30K reputation)SSC-Dedicated (30K reputation)SSC-Dedicated (30K reputation)SSC-Dedicated (30K reputation)SSC-Dedicated (30K reputation)SSC-Dedicated (30K reputation)SSC-Dedicated (30K reputation)

Group: General Forum Members
Points: 30349 Visits: 8666
A few guesses (based on very limited information provided and a few answered questions):

1) You have much faster IO now and it is able to "swap out data" in RAM (i.e. read it off disk) much more quickly. This can lead to lower PLE although query performance would still likely be better.

2) Another possibility is that someone messed with the schema and dropped one or more important indexes from large tables, leading to scans instead of seeks.

3) Oh, what about MAXDOP server setting?? If you had it set to 1 (not uncommon in the ERP world) and now it is back to default of 0, you would be parallelizing potentially many more queries, which would be consuming data more quickly.

4) Yet another is that you simply did monitoring while some very large reports were being run for some muckety-muck VP. :-)

5) It used to be that upgrading editions meant having to do a full-scan statistics update on all indexes/stats. But I don't think that is required when going from SQL 2008R2 to 2012.

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
HowardW
HowardW
SSCertifiable
SSCertifiable (5.9K reputation)SSCertifiable (5.9K reputation)SSCertifiable (5.9K reputation)SSCertifiable (5.9K reputation)SSCertifiable (5.9K reputation)SSCertifiable (5.9K reputation)SSCertifiable (5.9K reputation)SSCertifiable (5.9K reputation)

Group: General Forum Members
Points: 5857 Visits: 9892
It could be that the old server had a different number of system buses (NUMA nodes) and this is affecting PLE. I assume you're looking at the Buffer Manager: PLE counter? Given the specs you've said above, you must have multiple NUMA nodes and this counter is the average across all of these. Buffer Node: PLE gives the stats for individual pools.

Threads can access foreign memory to their local NUMA node, but there's a preference for locality, which I'm assuming would end up recycling individual (smaller) pools faster than one big one, therefore have a lower average PLE. In some situations, that could be worse for overall system performance, but generally not.

What are you seeing in terms of overall system performance, e.g. wait stats, average transaction times etc.?
Don Halloran
Don Halloran
Ten Centuries
Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)

Group: General Forum Members
Points: 1231 Visits: 1490
Thanks for the responses. I've been looking at this for a few days on and off. Today I've been sitting here watching the counters because I am seeing such weird, weird output.

Here are some numbers from right this instant:

Resource Monitor\Memory (Instance allocation is min 70,000 max 100,000)
In use: 105200
Standby: 25000
Free: 70

Standby and free look weird, I would expect 15 gig less in standby and 15 gig more in free.

Perfmon:
Buffer ManagerTongueLE = 12
% Disk Time (data) = 92,000 (!!!!)
Avg Queue Length (data) = 920
Disk Reads / sec (data) = 6000 (!!!!)
Disk Writes / sec (data) = 55

% Disk Time (log) = --
Avg Queue Length (log) = 0.02
Disk Reads / sec (log) = 0
Disk Write / sec (log) = 2

% Processor Time = 14.5

Instantaneous output from sp_whoisactive at the same time shows no signficant reads from query activity:

reads writes physical_reads
------ ------ --------------
54 0 64
29 0 96
6 0 0
75,567 0 28,832
4 0 0
4 0 0
4 0 0
4 0 0
4 0 0
5 0 0
40 0 14
18 0 0
5 0 0
5 0 0
4 0 0
NULL NULL NULL
13 0 16
5 0 0
5 0 0
NULL NULL NULL
NULL NULL NULL
56,959 5 56,780
NULL NULL NULL
NULL NULL NULL
NULL NULL NULL
69,398 36 3,417
1,452 0 1,722
NULL NULL NULL
NULL NULL NULL
NULL NULL NULL
120 3 16
NULL NULL NULL
NULL NULL NULL
4 0 0
1,756 8 304
NULL NULL NULL
7 0 7
NULL NULL NULL
NULL NULL NULL
NULL NULL NULL
207 2 23
NULL NULL NULL
90 0 0
NULL NULL NULL
8 0 0
NULL NULL NULL
NULL NULL NULL
NULL NULL NULL
27 0 0
98 0 0
NULL NULL NULL
8 0 0
NULL NULL NULL
NULL NULL NULL
101 3 92
224 6 16
NULL NULL NULL
NULL NULL NULL
NULL NULL NULL
NULL NULL NULL
NULL NULL NULL
NULL NULL NULL
NULL NULL NULL
NULL NULL NULL



sys.dm_os_wait_stats
CXPACKET is at the top (15%), followed by...
SP_SERVER_DIAGNOSTICS_SLEEP
XE_TIMER_EVENT
HADR_FILESTREAM_IOMGR_IOCOMPLETION
DIRTY_PAGE_POLL
BROKER_EVENTHANDLER
XE_DISPATCHER_WAIT
LOGMGR_QUEUE

PAGEIOLATCH_SH is down in 12th place (4.3%)

NUMA
Unfortunately I know virtually nothing about NUMA yet. I guess this is me learning about it, heh.

I see 5 nodes in sys.dm_os_nodes, 2 "ONLINE", 1 "ONLINE DAC", 2 "OFFLINE".

In sys.dm_os_memory_nodes I see:

memory_node_id virtual_address_space_reserved_kb virtual_address_space_committed_kb locked_page_allocations_kb pages_kb shared_memory_reserved_kb shared_memory_committed_kb cpu_affinity_mask online_scheduler_mask processor_group foreign_committed_kb
-------------- --------------------------------- ---------------------------------- -------------------------- -------------------- ------------------------- -------------------------- -------------------- --------------------- --------------- --------------------
0 203592464 2183796 48977664 10796224 1920 1920 1048575 1048575 0 21068516
1 35712 35732 2784 2784 0 0 1048575 0 1 0
2 35712 35732 2784 2784 0 0 1099510579200 0 1 0
3 32768 32788 51128584 11795384 0 0 1099510579200 1099510579200 0 23373076
64 0 20 0 10797256 0 0 1048575 1048575 0 0




It looks really weird to me. Query activity is minimal but disk activity is thorugh the roof and PLE is collapsing at semi-regular intervals. Any interpretations would be most welcome.

Blog on sqlservercentral
SQLRNNR
SQLRNNR
SSC Guru
SSC Guru (63K reputation)SSC Guru (63K reputation)SSC Guru (63K reputation)SSC Guru (63K reputation)SSC Guru (63K reputation)SSC Guru (63K reputation)SSC Guru (63K reputation)SSC Guru (63K reputation)

Group: General Forum Members
Points: 63401 Visits: 18570
Just curious about a couple of BIOS settings.

1. What is the hardware reserve memory set at for the server? This could be impacting your "standby" memory count.
2. Are the energy saving features enabled for this server? I know you are looking at memory and disk. But I am curious if it is being caused by a CPU that is being taxed due to the energy saving features being enabled.



Jason AKA CirqueDeSQLeil
I have given a name to my pain...
MCM SQL Server, MVP


SQL RNNR

Posting Performance Based Questions - Gail Shaw

Don Halloran
Don Halloran
Ten Centuries
Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)

Group: General Forum Members
Points: 1231 Visits: 1490
Hardware reserved set at 64
CPU use is way down at 5 - 15%.

Blog on sqlservercentral
SQLRNNR
SQLRNNR
SSC Guru
SSC Guru (63K reputation)SSC Guru (63K reputation)SSC Guru (63K reputation)SSC Guru (63K reputation)SSC Guru (63K reputation)SSC Guru (63K reputation)SSC Guru (63K reputation)SSC Guru (63K reputation)

Group: General Forum Members
Points: 63401 Visits: 18570
Hardware reserved, is that in MB or GB that you are showing?

NUMA - did you make any changes to CPU affinity or is the default setting still active (SQL Server managing)?

HowardW asked a question concerning Buffer Node v Buffer Manager in regards to which PLE you are monitoring. Which one is reporting the erratic stats?



Jason AKA CirqueDeSQLeil
I have given a name to my pain...
MCM SQL Server, MVP


SQL RNNR

Posting Performance Based Questions - Gail Shaw

Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search