Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase

system FlushCache happening often Expand / Collapse
Author
Message
Posted Monday, September 24, 2012 10:28 AM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Wednesday, September 3, 2014 8:25 AM
Points: 163, Visits: 604
SQL 2012 - enterprise... OLTP environment...

Experiencing the following - several times per hour. Has anyone experieced the following:

FlushCache: cleaned up 513008 bufs with 448519 writes in 75641 ms (avoided 92621 new dirty bufs) for db 10:0
average throughput: 52.99 MB/sec, I/O saturation: 104, context switches 5758
last target outstanding: 80800, avgWriteLatency 0

The system is generating the flush cache and causing severe performance degradations.
Post #1363610
Posted Tuesday, September 25, 2012 8:46 AM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Wednesday, September 3, 2014 8:25 AM
Points: 163, Visits: 604
We tried all sorts of fixes, including contacting Microsoft. Microsoft did not provide any feedback.

Finally - we executed the following in order...

DBCC FREEPROCCACHE

EXEC sp_recompile 'dbo.StoredProcedureName' (we preformed sp_recompile for over 100 Procedures).

DBCC FREEPROCCACHE

Once the final FreeProcCache finished - the server became operational w/out issue.

We'll be scheduling a daily process to execute the FreeProcCache.
Post #1364091
Posted Thursday, March 21, 2013 8:16 AM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Monday, June 23, 2014 4:39 PM
Points: 19, Visits: 65
We are occasionally experiencing the same message - perhaps once a day, is there any advantage to flushing the PROCCACHE more often? As far as I know, we don't run a lot of stored procs on this system. Mostly dynamic SQL
Post #1433810
Posted Thursday, March 21, 2013 11:23 AM


Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: 2 days ago @ 9:39 AM
Points: 1,157, Visits: 3,264
cheshirefox (9/25/2012)
We tried all sorts of fixes, including contacting Microsoft. Microsoft did not provide any feedback.

Finally - we executed the following in order...

DBCC FREEPROCCACHE

EXEC sp_recompile 'dbo.StoredProcedureName' (we preformed sp_recompile for over 100 Procedures).

DBCC FREEPROCCACHE

Once the final FreeProcCache finished - the server became operational w/out issue.

We'll be scheduling a daily process to execute the FreeProcCache.


FYI, CU3 for SQL Server 2012 has been released. It is available via:

http://support.microsoft.com/kb/2812412

There are a number of fixes in this particular CU which address a host of performance issues. Most notably:

SQL Server 2012 experiences performance issues in NUMA environments
http://support.microsoft.com/kb/2819662

FIX: CPU spike when there is no load on a server after you install SQL Server 2012 on the server
http://support.microsoft.com/kb/2813214

FIX: A query that runs against a DMV enters an infinite loop in SQL Server 2008 R2 or in SQL Server 2012
http://support.microsoft.com/kb/2819662

FIX: Poor performance in SQL Server 2012 when you run a SQL Server trace
http://support.microsoft.com/kb/2803529

FIX: Memory leak when you run queries against a temporary table in a nested stored procedure in SQL Server 2012
http://support.microsoft.com/kb/2803065


Tommy

Post #1433930
Posted Wednesday, March 27, 2013 1:03 AM
Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Tuesday, April 1, 2014 3:26 PM
Points: 316, Visits: 1,497
cheshirefox (9/24/2012)
SQL 2012 - enterprise... OLTP environment...

Experiencing the following - several times per hour. Has anyone experieced the following:

FlushCache: cleaned up 513008 bufs with 448519 writes in 75641 ms (avoided 92621 new dirty bufs) for db 10:0
average throughput: 52.99 MB/sec, I/O saturation: 104, context switches 5758
last target outstanding: 80800, avgWriteLatency 0

The system is generating the flush cache and causing severe performance degradations.


FREEPROCCACHE isn't the answer to this issue.


http://blogs.msdn.com/b/psssql/archive/2012/06/01/how-it-works-when-is-the-flushcache-message-added-to-sql-server-error-log.aspx

FlushCache == Checkpoint. This message is logged when a checkpoint takes longer than the recovery interval. It's more of a warning than an actual error.

In your example above, SqlServer wrote 4GB (513008 x 8k) of dirty pages to disk, taking 75.6 seconds in the process. Taking a step backwards to right before the checkpoint started, you had 4GB of dirty data in the buffer pool that hadn't been flushed to the data file(s) (of course it's already in the log file on disk). If a crash occurred right then, when SQL attempts to recover the db, it would have to roll-forward/roll-back the transactions associated with that 4GB of dirty data to bring the db current. Just writing the 4GB of data to disk takes 75 seconds, so the total time to run recovery will very likely be at least 75 seconds. If the recovery interval is 60 seconds, then we're going to take longer to recover the database than we expected.

And that's the problem: You're dirtying data in the buffer pool faster than the checkpoint process can handle within your recovery interval.

Extreme Example: Consider a 2TB database on a server with 512GB of RAM. The logs could be on SSD's, but the data files are on a slow Raid-6 Array. You update 20 GB of data and it takes 20 seconds (fast log writes to the SSD's, yay!). Now there are 20 GB of dirty pages in the buffer pool that need to be written to the slow Raid-6 Array. They are likely to mostly be 8k random writes. If you were able to get 1000 write IOPS out of your Raid 6 array (not likely, but i'm being overly optimistic), then you'd be looking at 40+ minutes to write that data to disk. In other words, if your cluster failed over right after your "20 second" update, it would take over 40 minutes.

If your recovery interval is 60 seconds (default), are you okay with it taking 75 seconds? 120 seconds? 300 seconds? If so, then you can safely ignore the errors.

If you're getting server performance issues at the same time, then keep reading. Checkpoints aren't supposed to cause a negative performance impact to running queries. If they are, then your problem is IO.

Start capturing perfmon counters related to disk IO LogicalDisk(*) and/or PhysicalDisk(*), and any sql counters related to disk io ( reads/sec, checkpoint pages/sec, log flushes/sec, etc.) 1 sec interval is best but no more than 5 sec.

Watch how Current Disk Queue Length, Disk Reads/sec, Disk Writes/sec, Avg Disk Sec/Read, Avg Disk Sec/Write, etc correlate to the sql counters.
What are your baselines?
What happens to your current disk queue length and latency (Avg Disk Sec/Read, Avg Disk sec/write) when a checkpoint occurs?
Using SAN storage? Involve your SAN admin.

To resolve this issue, you have several options:
1. dirty fewer pages (drop extra indexes, use compression, tune queries, etc). Fewer dirty pages means less work each checkpoint.
2. reduce IO load overall (Add memory to reduce reads/sec, move busy tempdb to different drive, tune queries, etc)
3. increase IO write capacity (extra spindles in SAN, add SSD's, switch from Raid-5 to Raid-10, etc)
4. smooth out checkpoint's IO load (set a really high recovery interval and perform manual checkpoints. Don't go here until you've got a really good handle on the perfmon counters above and can prove that this helps.)

Post #1435783
« Prev Topic | Next Topic »

Add to briefcase

Permissions Expand / Collapse