Speaking of performance... the following is admittedly a bit long but well worth the read (IMHO).
At one company I worked at, we had a 500GB set of databases that made up a system, and it was growing rapidly every day. Many jobs took 8 hours to run. One job in particular usually took 24 hours just to fail. Because it took so long to run and required multiple reruns to reach completion, they had it do only 2/3 of the work it was supposed to do.
To fix these performance problems, they bought a new fire-breathing monster of a server, a whole new high performance SAN, and spent a whole lot of money changing from the Standard Edition of SQL Server to the Enterprise Edition. Of course, there was the cost of migration, testing, ancillary equipment such as routers/switches, etc, etc.
The 8 hour jobs dropped to about 7 hours for about a month and then went back to 8 hours. The 24 hour job still failed a lot, and the first failure still usually took 24 hours to show up.
Since the 24 hour job was for PUC compliance, we could get fined for every day that we went past the deadline. As a result, every time the run failed, a half dozen managers, several "experts" in Finance, Accounting, and a couple of other departments, several developers, and a couple of DBAs were all brought together to try and fix the problem. You can imagine how much that cost per hour. The funny thing is, they never fixed it. Every month, the same group of "experts" was summoned and every month, they'd watch it fail over and over and over until, by some small miracle, it would finally run to completion. Even then, they sometimes had to rerun it because when the run failed, it would leave some pretty rotten apples on the trail and people would sometimes miss that bad data before they attempted a rerun.
The reason why it would fail was just because it took so long to run and something would deadlock with it. The reason why it took so long to run was because of the code itself. To be brief and brutal, it was crap code written by people who didn't understand how to work in a database.
People thought that it was doomed to continue to take that long. After all, it was checking 63 four-million-row tables scattered across 63 different dynamically named databases for duplicate CDRs (Call Detail Records). The consensus was that no one would ever be able to make the job run faster, especially since we had just installed the new fire-breathing hardware, and the vendor that wrote the code insisted that no one could handle that much data in less than 24 hours. They were "experts". Everyone believed them.
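For context, a duplicate check like that is usually a single aggregate query per table rather than a row-by-row comparison. The database, table, and column names below are invented for illustration; the actual schema was never part of this story:

```sql
-- Hypothetical sketch: find duplicate CDRs in one of the dynamically
-- named databases. All names here are made up for the example.
SELECT   CallingNumber, CalledNumber, CallStartTime, COUNT(*) AS DupeCount
FROM     CDR_DB_001.dbo.CDR
GROUP BY CallingNumber, CalledNumber, CallStartTime
HAVING   COUNT(*) > 1;
```

Because the databases were dynamically named, the same statement would presumably be built once per database name and executed with sp_executesql, rather than hand-coded 63 times.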
That wasn't all of it, either. The code was also executed against a subset of 3 databases per day. The daily run would fail every other day for the same reasons stated previously. They tolerated it because each daily run "only" took 45 minutes to complete. Besides, the experts said we were lucky it didn't fail more often and that they were surprised it "only" took 45 minutes considering the amount of data we had (running on 32-bit machines at the time, to boot!).
Even though I offered to take a look at it, I was told to keep my hands off the code because the "expert" vendor had already stated that it was just because of the "overwhelming amount of data" and that it would be a total waste of time to look at it. If you know me, then you'll also know how much attention I actually paid to that. ;-)
With the help of the Director of Finance (he "owned" the code), who wrote a dandy 2 paragraph requirements document, I wrote the code at home because I'd been told I wasn't allowed to work on it. The Director of Finance understood my problem and had his people test my code instead of using official "QA" channels.
The first time they tested the daily run, I got an urgent email saying that "it didn't run correctly". When I asked them why, they said "it ran for 7 seconds and then quit". I laughed out loud and told them to check the data. They found out that their 45-minute daily job was now correctly running in less time than it takes to yawn.
The monthly job also did its trick correctly. The 24-hour job now ran in 11 minutes flat. It also did the full 94 databases in that time instead of the usual 63.
I also got my hands on some of the 8-hour jobs and made them run, quite literally, "in seconds".
Alright... sounds like a big brag on my part. That's not my intention. What I want you to understand is that, all told, they spent a bit more than a quarter of a million dollars on new hardware, the Enterprise Edition of SQL Server, the manpower to set it all up with new switches, routers, cabling, etc, etc, the manpower to migrate the databases, synchronize them once the bulk of them had been moved, and to test them. And it did squat for performance.
Along comes someone with just a bit of database knowledge (I'd only been working with SQL Server for a handful of years) and he did the "impossible". He converted a 24 hour "impossible to improve" job to an 11 minute run that did 50% more work and it hasn't erred in 6 years AND he did it in about 20 hours (4 hours a night for a week).
So here's the takeaway on this. Yes... buy good fast hardware. In any race, you need a really good car (the server), some really good tires (the SAN), and some really good gas (the Enterprise Edition of SQL Server). BUT!... unless you have a driver that can keep that high performance car on the track, that car isn't any better than a skateboard. If you're going to invest in some hardware, then you really need to invest in someone that knows how to drive it. Even if you buy machines 10 times as fast (and those aren't actually available, yet), where are you going to find a box that will take a run from 24 hours to 11 minutes? It's just not going to happen.
Spend the money on hiring a full time "Ninja" level database Developer or two and then listen to him/them... or leave the car in the garage. Performance is in the code. ;-)
RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code: Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
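As a small illustration of that shift (the table and column names below are invented, not from the story), compare touching one row at a time with describing what should happen to the whole column in a single statement:

```sql
-- RBAR: do it to one row at a time with a cursor (slow).
DECLARE @ID int;
DECLARE cur CURSOR FOR
    SELECT OrderID FROM dbo.Orders WHERE ShippedDate IS NULL;
OPEN cur;
FETCH NEXT FROM cur INTO @ID;
WHILE @@FETCH_STATUS = 0
BEGIN
    UPDATE dbo.Orders SET Status = 'Pending' WHERE OrderID = @ID;
    FETCH NEXT FROM cur INTO @ID;
END;
CLOSE cur;
DEALLOCATE cur;

-- Set-based: say what happens to the column, once, and let the
-- engine handle all of the qualifying rows in a single pass.
UPDATE dbo.Orders
SET    Status = 'Pending'
WHERE  ShippedDate IS NULL;
```

Both produce the same result; the second gives the optimizer one statement to plan instead of one round trip per row.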
Although they tell us that they want it real bad, our primary goal is to ensure that we don't actually give it to them that way.
Although change is inevitable, change for the better is not.
Just because you can do something in PowerShell, doesn't mean you should.