Performance Tuning Through Panic

Question

Performance Tuning Through Panic

Steve Jones - SSC Editor

SSC Guru

Points: 734418
More actions
February 8, 2014 at 12:28 pm

#148142

Comments posted to this topic are about the item Performance Tuning Through Panic

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply

SQLRNNR SSC Guru Points: 281334 More actions · Answer 1

Until clients and vendor are willing to tune their code, many problems will only be masked by hardware.
For a time. Then they'll reappear.

Getting clients and vendors to tune code is like a double-edged sword. If you get them to do it, then you need to find a new client. If they don't, then you have plenty of work but sometimes the problems keep coming back and they become skeptical. Ultimately we want the code to be tuned and working better.

Until then...

Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events

Jeff Moden SSC Guru Points: 1003853 More actions · Answer 2

I like this editorial and Paul's sort article on this a lot but, most of the time, I've found it to be just crap code and crappy table design.

Heh... I guess the term RBAR has made the big time if Paul is using it. 😛

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Gary Varga SSC Guru Points: 82166 More actions · Answer 3

Another key issue is "perceived performance". I have worked on a number of systems when the performance has been drastically improved over the original system, however, the users disagree. This has at times lead to management support for the users belief that the performance is worse.

Whilst I do agree with management support for users opinions, it is important for IT professionals to fix real problems and resolve perceived ones. For performance this is achieved through measurement. Just by sticking to the facts you can be deemed as being helpful. It brings clarity to the situation e.g. showing that the users' claim that it is taking twice as long is inaccurate but accepting that it is 20% slower and agreeing (before being asked) that this is unacceptable is helpful.

I guess the key thing is to try and be the voice of reason and take the emotions out of the situation. Publicly support the possibility of the perceptions then prove or disprove them. Don't bother counter the claims but lead with the results from measurement. If there is a performance improvement but it is being reported as a fall then offer the facts and ask if there may be another reason.

I guess it boils down to what Douglas Adams told us all: Don't panic!!!

Gaz

-- Stop your grinnin' and drop your linen...they're everywhere!!!

Robert.Sterbal SSCrazy Points: 2855 More actions · Answer 4

Admittedly baselining a system you are about to replace is tedious, but might be worth the effort in addressing the user feedback.

Jeff Moden SSC Guru Points: 1003853 More actions · Answer 5

Gary Varga (2/10/2014)
Another key issue is "perceived performance". I have worked on a number of systems when the performance has been drastically improved over the original system, however, the users disagree. This has at times lead to management support for the users belief that the performance is worse.

+1000 on that. Been burned by the very perception before. Just the mere mention of any change can get the users watching for slowness and because they're concentrating on that, it actually does seem slower to them. That's why part of my baselining for such things include a trip out to user-ville with a stopwatch and a list of screens to be tested.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jim P. SSCrazy Eights Points: 8725 More actions · Answer 6

One that I ran into was testing the EOM processing compared to daily production during a SW upgrade.

We took a copy of the Accounting DB and put it on the test server on the twenty-nineth of the month. Ran the upgrade. Then we ran production data to the thirty-first into both systems. The accounting staff then ran the EOM reports on both and did a comparison. The production was like a quad processor 24 GB RAM on a full up iSCSI RAID system on a clustered server SQL 2005 server. The test server was an older dual core 8GB RAM with local drives that were barely setup as a basic RAID 5. The clients were also older machines.

So the expectation was the that the test system reporting would be slower than production. And it was proved out, but the accounting numbers matched so we scheduled the upgrade in mid-month on a light week(end).

We then hit the next EOM on production. They report the EOM processing is dead slow. We go through and investigate. Apparently the SW vendor went through and put in "enhancements" so big customers can run cubes and some other advanced reports. It also dumped more data into more tables that most customers didn't need. About 500 added lines of code in the sprocs and such. Since it was a new version (x.0) there was no way to turn off the extra code in the interface. Version x.1 was out in a few months that allowed the configuration to turn the advanced code on or off as needed.

So after that we made a point to run the data into the test system. Take a back up. Do the EOM processing in the current version on the test system. Restore the DB and then upgrade and redo the EOM on the test. And then compare the numbers purely of the test match. And still matched production. And the times as well. I grant it was a royal PITA, but saved us in heartache down the road.

----------------
Jim P.

A little bit of this and a little byte of that can cause bloatware.

Nadrek SSC-Insane Points: 20039 More actions · Answer 7

The first key to performance tuning: Don't panic.

The second key, as mentioned in the article, is to have enough knowledge to be able to do very fast root cause determination that's still accurate.

Personally, I usually go in more or less this order (varying based on domain knowledge/gut instinct/tacit knowledge), but rarely use all the steps:

Don't panic.

If you're panicking, take a deep breath, hold for a count of five, release. Repeat until you are no longer panicking.

Check with actual users to verify actual symptoms... and how it's different from before!

Check overall server stats

Windows 2008 R2 or better Resource Monitor

Disk latency and throughput, CPU, memory, network throughput

Task Manager

SSMS Activity Monitor

Adam Machanic's Who Is Active

Perfmon (you can save an entire setup, and I love report mode instead of graphs)

Your company's performance reporting software/queries

Wait states!

Deadlock counters!

Disk and network error counters!

Recent configuration changes (MaxDOP, cost threshold of parallelism,

Check with hardware/networking/SAN/etc. teams about global problems or recent changes

Check with software owners/developers about whatever area the users show symptoms in

Open up Profiler and take a look, both globally (briefly) and narrowly

SQL:BatchCompleted and RPC:Completed

You need to have a good feel (baseline) for what's out of place.

Check your server's management software for issues (disk failure, RAM failure, overheating, etc.)

Specalized: Check on your Hypervisor

Specialized: Open up Wireshark and watch the network traffic

Specialized: Check on your SAN administration screens

Specialized: Check on your network traffic, IDS/IPS appliances, throughput rate benchmarking, DNS, etc.

When you're finished, wash your towel.

Note that the first few steps relate to requirements gathering/problem identification. Permanent fixes are better in the long run than bandaids every N days/weeks, and permanent fixes require root causes be accurately determined.

Luis Cazares SSC Guru Points: 183698 More actions · Answer 8

I believe that you need to add something to your steps.

If you're panicking, take a deep breath, hold for a count of five, release. Repeat until you are no longer panicking. (Repeat between steps if needed)

Luis C.
General Disclaimer:
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

How to post data/code on a forum to get the best help: Option 1 / Option 2