Speeding up database access Part 2 - Pinpointing Other Bottlenecks


mperdeck
Comments posted to this topic are about the item Speeding up database access Part 2 - Pinpointing Other Bottlenecks
Jonathan AC Roberts
In the Fragmentation section, the article instructs you to:

DECLARE @DatabaseName sysname
SET @DatabaseName = 'mydatabase' --use your own database name

Then uses it:

 FROM sys.dm_db_index_physical_stats(DB_ID(@DatabaseName), NULL, NULL, NULL, 'Sampled') ips

The query only actually works against the current database, so there is no need to "DECLARE @DatabaseName sysname" at all; just use DB_ID() with no parameter:

 FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'Sampled') ips
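For anyone following along, a minimal sketch of what the corrected query could look like (the join to sys.indexes for index names is my own addition, not from the article):

-- Fragmentation per index in the current database; no @DatabaseName needed.
SELECT OBJECT_NAME(ips.object_id) AS table_name,
       i.name AS index_name,
       ips.avg_fragmentation_in_percent,
       ips.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'Sampled') ips
JOIN sys.indexes i
    ON i.object_id = ips.object_id AND i.index_id = ips.index_id
ORDER BY ips.avg_fragmentation_in_percent DESC;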


Jonathan AC Roberts
The Disk Usage section is not relevant to SAN disks. We have systems that run at over 100% disk time all day. I think this is because a SAN has multiple disks and perfmon adds the percentages together, so you can quite happily have a system running with a disk time of 150% all day. It is more important to look at the average and current queue lengths.
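For what it's worth, you can also watch I/O queuing from inside SQL Server rather than through perfmon's summed percentages. A sketch using the SQL Server 2005+ DMVs:

-- I/O requests currently pending at the OS level, mapped to database files.
SELECT DB_NAME(vfs.database_id) AS database_name,
       vfs.file_id,
       pio.io_type,
       pio.io_pending,
       pio.io_pending_ms_ticks
FROM sys.dm_io_pending_io_requests pio
JOIN sys.dm_io_virtual_file_stats(NULL, NULL) vfs
    ON pio.io_handle = vfs.file_handle;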
steven.malone
Disk usage
SQL Server is heavily disk bound, so solving disk bottlenecks can make a big difference. If you found memory shortages in the previous section, fix those first, because a memory shortage can lead to excessive disk usage in itself. Otherwise check these counters to see if there is a disk bottleneck for some other reason.

Categories: PhysicalDisk and LogicalDisk

% Disk Time - Percentage of elapsed time that the selected disk was busy reading or writing.
Avg. Disk Queue Length - Average number of read and write requests queued during the sample interval.
Current Disk Queue Length - Current number of requests queued.
If % Disk Time is consistently over 85%, the disk system is stressed. Avg. Disk Queue Length and Current Disk Queue Length refer to the number of tasks that are queued at the disk controller or are being processed. You want to see a counter value of 2 or less. If you use a RAID array where the controller is attached to several disks, you want to see counter values of 2 times the number of individual disks or less.

Part 8 will show how to fix disk issues.


It is very important to look at each PhysicalDisk individually; otherwise you get an average.
In my opinion the numbers above indicate a severely stressed disk subsystem.
At half the above levels the disks are probably already affecting performance.

The Disk Usage section is not relevant to SAN disks

Actually it is. First, see above: "look at each disk individually."
If all you can get from your SAN is over 100%, complain to the vendor and get them to fix their driver.

More important is the fact that a SAN is just another way of connecting disks to the computer. It does not make any one disk spin faster, nor the heads settle onto a track any faster, nor does it change the performance advantages and disadvantages of the various RAID configurations. The only performance attribute a SAN brings to the table is higher total bandwidth; the rest of its advantages are administrative. Configuring a three-disk RAID 5 for the primary data is equally bad on a SAN, a high-end RAID controller, or JBOD.
Having the transaction log file on dedicated spindles (i.e. dedicated heads) is still just as important.
Paul Randal
Sigh - another article mentioning 300 as a threshold for Page Life Expectancy. Go read this article that explains why 300 is nonsense.

Also, low PLE or BCHR does NOT mean SQL doesn't have enough memory - it means something is causing the buffer pool to thrash - rarely is it a memory issue. It's usually some change in what SQL Server is doing - e.g. large parallel table scans instead of index seeks because of a bad plan.
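One way to hunt for the kind of scans Paul describes (not from the article, just a common approach) is to pull the heaviest readers out of the plan cache:

-- Top cached statements by physical reads - candidates for buffer pool churn.
SELECT TOP 10
       qs.total_physical_reads,
       qs.total_logical_reads,
       qs.execution_count,
       SUBSTRING(st.text, qs.statement_start_offset / 2 + 1,
           (CASE qs.statement_end_offset
                WHEN -1 THEN DATALENGTH(st.text)
                ELSE qs.statement_end_offset
            END - qs.statement_start_offset) / 2 + 1) AS statement_text
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st
ORDER BY qs.total_physical_reads DESC;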

Looking at fragmentation for tables with less than a few thousand pages is usually a waste of time, and your statement that fragmentation levels < 20 aren't worth dealing with is quite wrong. What if the fragmentation is focused on the 'in use' portion of a table or index? You should use the numbers from Books Online (which I wrote). Your query is also broken because you're not filtering on allocation unit type - all indexes/tables with LOB or ROW_OVERFLOW data will show up as fragmented. Fragment_count is a pretty useless number we made up to try to make fragmentation easier to understand - you should ignore it.
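To make Paul's point concrete, a sketch of the fragmentation query with the allocation unit filter added (the 1000-page floor reflects the "few thousand pages" rule of thumb; adjust to taste):

-- Only in-row data, only objects big enough to matter.
SELECT OBJECT_NAME(ips.object_id) AS table_name,
       ips.index_id,
       ips.avg_fragmentation_in_percent,
       ips.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'Sampled') ips
WHERE ips.alloc_unit_type_desc = 'IN_ROW_DATA'  -- exclude LOB/ROW_OVERFLOW noise
  AND ips.page_count > 1000                     -- skip tiny tables and indexes
ORDER BY ips.avg_fragmentation_in_percent DESC;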

Thanks

Paul Randal
CEO, SQLskills.com: Check out SQLskills online training!
Blog: www.SQLskills.com/blogs/paul Twitter: @PaulRandal
SQL MVP, Microsoft RD, Contributing Editor of TechNet Magazine
Author of DBCC CHECKDB/repair (and other Storage Engine) code of SQL Server 2005
GilaMonster
This piece is just wrong.

If % Disk Time is consistently over 85%, the disk system is stressed. Avg. Disk Queue Length and Current Disk Queue Length refer to the number of tasks that are queued at the disk controller or are being processed. You want to see a counter value of 2 or less. If you use a RAID array where the controller is attached to several disks, you want to see counter values of 2 times the number of individual disks or less.


Firstly, with the number of layers between SQL Server and the disks on a SAN or similar, it's near-impossible to judge whether a given queue length is good or bad.

The second reason is that there are operations in SQL Server that intentionally drive the queue length high: they submit large numbers of IOs and then carry on working while the IOs complete.

Queue length just isn't a useful metric these days.

The problem with disk time is that it is not a value between 0 and 100. It can go to several times 100, depending on what's beneath that physical drive. A value of 500 may be perfectly fine.

Rather stick with the latencies (Avg sec/read and Avg sec/write) for diagnosing disk-related bottlenecks.
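A sketch of pulling those latencies from inside SQL Server (the numbers are cumulative since the last restart, so sample twice and diff for current behaviour):

-- Average read/write latency per database file, in ms, since instance startup.
SELECT DB_NAME(vfs.database_id) AS database_name,
       vfs.file_id,
       vfs.io_stall_read_ms / NULLIF(vfs.num_of_reads, 0)   AS avg_read_latency_ms,
       vfs.io_stall_write_ms / NULLIF(vfs.num_of_writes, 0) AS avg_write_latency_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) vfs
ORDER BY avg_read_latency_ms DESC;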

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass


GilaMonster
On the compiles and recompiles...

These counters will show high values at server start up as every incoming query needs to be compiled. The plan cache sits in memory, so doesn't survive a restart. During normal operation, you would expect compilations per second to be less than 100, and re-compilation per second to be close to zero.


It depends, really. If I've got a system with very volatile data, where the statistics are changing fast, I want to see relatively high recompiles/sec, as it means I'm getting execution plans that are optimal for the current state of the data. SQL recompiles for two reasons: accuracy and performance. Accuracy means that something in the schema has changed; performance means the stats have changed. If the stats have changed significantly, I'd rather have a recompile than a non-optimal execution plan being reused.
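If you want to watch those counters without perfmon, they're exposed through a DMV as well; a sketch (the values are cumulative, so sample twice over a known interval to get per-second rates):

-- Cumulative compilation counters; diff two samples to get the /sec rates.
SELECT counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE object_name LIKE '%SQL Statistics%'
  AND counter_name IN ('SQL Compilations/sec',
                       'SQL Re-Compilations/sec',
                       'Batch Requests/sec');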

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass


chris.puncher
Hi Gail and Paul,

I've read several articles like this one, and they are frequently followed by posts saying that this metric or another is
a) Pointless
b) Out of date
c) Just SQL folklore based on something Paul once said ;-)
d) All of the above

Are there any metrics that can be trusted in Perfmon for diagnosing system performance problems? Do either of you know of, or have you written similar articles that are available online?

Thanks,

Chris
GilaMonster
This is a fantastic discussion, slightly old but not seriously so:
http://runasradio.com/default.aspx?showNum=81 (30 min or so podcast)

Many of the metrics can be trusted and used, not in terms of hard thresholds, but in terms of what's normal. Take Paul's PLE example: say my production server usually has a PLE between 10,000 and 15,000 during business hours. If I check perfmon today and find it sitting at 4,000, that is a cause for concern. Not because it's below some magic number, but because it's far from what's normal for my system.

p.s. The PLE threshold of 300 was relatively sensible 10 or so years ago, when servers had small amounts of memory. These days, not so much. A PLE of 300 means your entire buffer pool is being discarded and re-fetched from disk every 5 minutes. On servers that can easily have > 64GB of memory, that's not great.
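For anyone who wants to track that baseline, PLE is exposed through a DMV too; a sketch (log this over time and compare against your own normal, per the discussion above):

-- Current Page Life Expectancy, in seconds.
SELECT cntr_value AS page_life_expectancy_sec
FROM sys.dm_os_performance_counters
WHERE object_name LIKE '%Buffer Manager%'
  AND counter_name = 'Page life expectancy';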

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass


pkrudysz
I totally disagree with people saying that a PLE (Page Life Expectancy) of 300 is an irrelevant number and that we should all just watch our servers for deviations from the usual PLE. BS! If your normal PLE is in the 10-15 thousand range and it drops below 4000, then no, it's NOT a problem! The server is still performing well. Whereas if it drops below 300, you have a memory issue regardless of what your usual PLE is. I'm not saying 299 is bad and 301 is OK, but if you're somewhere below that range, your server bogs down considerably.