I do very little consulting these days, perhaps 4-5 weeks per year, and that only to make sure I maintain 'real world' skills that I need to stay credible in the classroom. Much of that consulting is basic performance reviews of existing code, and today I've got a couple notes about things I've noticed over the past few years:
I learn something every time I do these engagements, and actively look for those aha moments.Much of is mundane, getting rid of scans and key lookups. More fun - but harder - is digging through complex processes (especially ETL) to see where you can make a nice gain. A day of it is fun, a week is exhausting.
Good stuff to think about if you're bringing in a consultant, make sure you know and they know your expectations.
One of the things I teach in our admin course is that it's not enough to just be the gatekeeper/central point of change when it comes to being a DBA, we should be adding value:
You'll notice that I don't put in the list verifying the change is correct - that's a QA task!
One sub bullet point I've added to 'is it stupid' recently is checking queries for unused tables. I ran across a couple instances lately of what was probably copy/paste code reuse and I was tuning the queries, realized that several tables were included that had no effect on the results. Easy to miss something when you see things like this so it definitely needs to go back for testing, but it's worth taking the deeper look right then, before you deploy.
I was visiting with a client recently when they asked me to come take a look at an incident in progress, the number of connections to the server had just about doubled over what they were normally. As far as troubleshooting this scenario I always start by checking for blocking. If you've got spids blocked they can't go back into the connection pool, forcing the connection pool manager to open up new threads.
No blocking? The next step is to run two different group bys against sysprocesses, one by hostname, the other by program name. I'm looking for a clue that will show all the connections are being generated from one machine or one application, or both! In this case we could see that the connections were evenly distributed across the web farm, but trying to break down by application wouldn't work because they didn't set the application name in their connection string. Fixing this is easy, just add the following to the current connection string:
;Application Name=My App v1.1.1
We couldn't identify anything obviously wrong on the server, in the error/event log, or in their application logging. Within a few minutes we saw a nice reduction in connections and soon things were back to normal. Would having the application name have made a difference? Unknown to unlikely in this case, but given that it's so easy to do, why not make the change so that next time you have one more piece of data?
With all the options we have for viewing query plans I still prefer the graphical view. Combined with tooltips and the occasional journey to the properties dialog it solves the problems I work on. The only downside is that the plan view tends to be verbose and the zoom options just don't work well. Resize to much smaller than the default and you can barely read them, leaving the 'fly over' mode (using the "+" button in the bottom right corner) to navigate the plan looking for whatever is to be found. It's not unworkable, but it does get in the way of a holistic view of what happens when. In other words, we have to divert a portion of our attention from understanding the problem to just using the tools.
Given todays prices there is a simple alternative, just upgrade your primary monitor to 24". No, that won't handle every plan, but it make most of them a lot easier to work with. At probably $400 it's not cheap, but doable. I've seen 22" wide screens recently for under $200.
I do tuning with two monitors, writing/examing queries on the left, Profiler on the right. Alt-Tabbing around is another distraction when I'm problem solving. Using Profiler let's me see easily the exact cost as well as the history of what I've tried/changed so far. I mention that because while using monitors of two different sizes is better than using two monitors, it's also a little visually distracting. The best case is having monitors that are exactly the same.
So there you go fellow DBA's, a starting point for the budget request for next year. Justifying tools is never easy compared to 'solutions', but monitors is an easier sell than some. Struggling to get the monitors approved? Suggest that perhaps your boss, or the CFO/CEO give dual 24" monitors a try to see if there any benefit. Shortly after that monitor sales will go up, everyone in sales will have two monitors, and if you're really lucky, you'll get them too!
Rebuilding Stats was published yesterday on SSC, some nice comments posted to it as well. The main point of the article was that if you're rebuilding indexes with the default options you're automatically getting stats update on those columns as well.
I'll be teaching a one day performance tuning seminar that targets beginning DBA's and developers that do data access, hoping to provide some useful insight into things that they can do immediately to address performance problems. It's on May 2, 2008 in Jacksonville and will serve as the 'pre-conference' event for the Jacksonville SQLSaturday on May 3rd. It's going to be interesting to see if I can get DBA's & developers talking the same language!
I wrote this article on computed columns to support some of the teaching I do for my beginner to intermediate level performance tuning class. It's not a technique you'll apply that often, but I think it's used less often than perhaps it should be. Hope you'll take a look and add a comment to the article.
Part looks at using views for partitioning, still a valid technique even in the SQL 2005 world. I added in some basic performance tuning information as well, looking at the query plan when partitioning is done correctly vs incorrectly. The final installment has been submitted and will cover the SQL 2005 implementation of partitioning.