Short Stroking and Performance

  • Comments posted to this topic are about the item Short Stroking and Performance

  • I think hard disks have simply gotten too big to do some of the old tricks to bring multiple spindles into play. As for "short stroking", Peter Norton figured this out eons ago and his disk utilities would move the more frequently used files to the outer tracks and the less frequently used files to the inner tracks. The problem now is that everyone thinks disks on a SAN take care of themselves, defragment themselves, etc, etc, ad infinitum. RAID arrays help a bit, but a lot of folks simply don't take the time to set them up right... if they can be manually set up at all.

    Shifting gears a bit, this is yet another "hardware tuning effort" and, while I agree that disks are the slowest element in any computer and do need to be tuned (wouldn't it be wonderful if log files could be constrained to live on the outer tracks?), people shouldn't expect it to be a performance panacea. Yep... you can get a 50% or maybe even a 100% improvement in code speed by tuning your disks (most companies have fixed the dreaded sector offset problem), but it all pales in comparison to writing good code, where performance increases of more than 3,000% (30 times faster) and, sometimes, thousands of times faster (+100,000%) can be realized with just a little thoughtful effort, even in the face of bad database designs.

    An example of this (and some have heard this particular example from me a couple of times): we had 2 processes at an old job I worked at. The first was a daily process that did a dupe check across 3 daily databases (1 for a given day in each of 3 months), each containing a 4 million row CDR (Call Detail Record) table, plus 1 "in-process" table which usually contained several thousand rows. The dupe check compared inbound number, outbound number, and time of day and that was IT! That daily process took 45 minutes to run and would sometimes fail. The process that I wrote did it in 17 seconds, meaning it ran about 158 times faster (2,700 seconds / 17 seconds), a roughly 15,800% performance improvement. Try getting that with hardware tuning or even with the latest and greatest hardware.
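    The set-based idea behind that kind of rewrite can be sketched roughly like this. The thread doesn't include the actual T-SQL, so this is a small illustration using Python's sqlite3 module; the table names (cdr_day1, in_process) and columns (inbound, outbound, call_time) are invented for the example, not taken from the original system:

    ```python
    import sqlite3

    # Hypothetical schema: the post only says the dupe check compared
    # inbound number, outbound number, and time of day.
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    for t in ("cdr_day1", "in_process"):
        cur.execute(f"CREATE TABLE {t} (inbound TEXT, outbound TEXT, call_time TEXT)")

    cur.executemany("INSERT INTO cdr_day1 VALUES (?,?,?)",
                    [("555-0001", "555-0100", "09:00"),
                     ("555-0002", "555-0200", "10:30")])
    cur.executemany("INSERT INTO in_process VALUES (?,?,?)",
                    [("555-0001", "555-0100", "09:00"),   # duplicate of a daily row
                     ("555-0009", "555-0900", "11:15")])  # unique, passes the check

    # One set-based join flags every duplicate in a single pass, instead of
    # checking the in-process rows one at a time (RBAR). Against multiple
    # daily tables you would repeat the join (or UNION ALL the tables) rather
    # than loop over rows.
    dupes = cur.execute("""
        SELECT DISTINCT p.inbound, p.outbound, p.call_time
        FROM in_process AS p
        JOIN cdr_day1  AS d
          ON  d.inbound   = p.inbound
          AND d.outbound  = p.outbound
          AND d.call_time = p.call_time
    """).fetchall()
    print(dupes)  # [('555-0001', '555-0100', '09:00')]
    ```

    The point of the sketch is the shape of the query, not the engine: the optimizer gets one declarative join over whole columns to work with, which is where the orders-of-magnitude wins described above come from.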

    The monthly process did dupe checks across the same kind of databases, except there was 1 for each day in a full 3 months. Basically, there were 93 databases (3*31) plus the "in-process" table. The original code ran so slow (usually about 24 hours, usually to fail on the first attempt, so at least 1 rerun was needed) that they throttled it back to only check 2 months' worth of databases (62 instead of 93 four million row tables), so they were only doing 2/3rds of the required job. If we extrapolate, that's 24/2*3 = 36 hours it would have taken for all 3 months instead of just 2. The process that I wrote did all 3 months in 45 minutes. That's 48 times (4,800%) faster and, to the best of my knowledge, it hasn't failed since I wrote it back in 2006.

    Prior to me even finding out about that mess, they upgraded from SQL Server 2000 SE to SQL Server 2000 EE (yeah... they were quite a ways behind), migrated from a 4 processor box to a 16 processor box with a whole lot more memory, and changed from DAS to a killer (at the time; I still love it) EMC CLARiiON SAN, and all they got was about a 20% improvement, which brought them up to the levels I saw when I first looked at the problem.

    So, yeah... I agree... hardware and hardware tuning are important but, if you really want to make an improvement, remember what most people won't even consider because hardware is supposedly cheaper than talent (and, in the long run, it's not really)... "Performance is in the code".

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • My prior company was an end user of SW (a bank that used delivered SW for our major apps). When I walked in the door we had about ten servers in the HQ and the rest were at every branch. Some at the HQ were just straight file servers. There was a combo of Oracle and SQL. Ten years later when I left we had about fifteen-twenty servers at the HQ. We had clusters and had bought some SAN units.

    Back then it was 70GB or 140GB SCSI physical drives that we could line up. The servers were booting off their internal hard drives, but all the data drives were on the SAN. So I did all the research, but I couldn't come up with a way to justify 2 (or 3) x 70GB drives tied to tempdb while the other drives were tied up for data and the log. I was lucky enough to get a D: and an L: to split the mdf from the ldf. But we got decent performance. And when we got bad performance, we complained to the SW devs.

    So then I get to my new company, which was a SW developer. The DB design was bad, with GUIDs and replication. We hosted a bunch of our clients, but no one had managed the hosted VM environments. So I was able to change performance from somewhere between bad and abysmal to an acceptable level just from tuning. But they never sold enough copies of the product to be profitable, so they shut production down.

    They shifted me to another product I hadn't touched, which was a web interface client. They send me out for training, and the replacement DBA (not the original DBA designer) shows me this beautiful map: the T: is the tempdbs, the S: is the local SW, the D: is the data, the L: is the log files. I mean, this is the dream setup that we would all love to create for drive, data, and log deconfliction. This is what you want to see.

    They still hadn't set up all the maintenance things I expect, but I can deal with that. So I get back home, get all the maintenance in place, and go on with life. But we were already several years down the road, and when I trended the growth I brought it up to management, as a good DBA should, that we had about a 6 month window.

    So we bring in the current SAN sales guy and his tech. That is when we get a real intro to the SAN unit that we have. All that nice design was totally <several bad words> useless. The SAN unit was designed so that it presented an "aggregate" of the drives it had available. In other words, the T: drive's 50GB was assigned out of the fourteen drives in the SAN, and you had no control over the individual spindles. If you added another SAN drive set, those drives would also go into the mix, and the "aggregate" controller would use them as needed.

    My level of disgust was such that I had problems answering e-mails on the subject for a week.

    We were bought out so that SW has gone away as well. But a DBA that does not have a decent grasp of hardware and equipment design rates very low in my books.



    ----------------
    Jim P.

    A little bit of this and a little byte of that can cause bloatware.

  • Jim P. (4/5/2014)


    So I was able to change performance from somewhere between bad and abysmal to an acceptable level just from tuning.

    Just curious... "tuning" what? Hardware, disks, code, indexes, ???, or some combination?

    --Jeff Moden



  • Jeff Moden (4/6/2014)


    Just curious... "tuning" what? Hardware, disks, code, indexes, ???, or some combination?

    The indexing was bad. What little maintenance plans were in place had no re-indexing and the like. The backups weren't being done at any scale matched to log growth for the full recovery model. The logs were oversized for what was needed. The tempdb didn't have data files matched to the CPUs.

    And then, for any SP I caught that was written poorly, I sent a change request to the devs to fix it. Not all of them were taken up, but I saw some of them changed in the next version.

    But just the day-to-day fixing the broken issues for customers kept me busy so I couldn't truly afford to do a systematic top to bottom code review. :crying:



    ----------------
    Jim P.


  • Jeff Moden (4/5/2014)


    So, yeah... I agree... hardware and hardware tuning are important but, if you really want to make an improvement, remember what most people won't even consider because hardware is supposedly cheaper than talent (and, in the long run, it's not really)... "Performance is in the code".

    While I agree, the real problem that I have seen is in dealing with vendor code. You can't fix it without violating the terms of the contract, and they refuse to fix it. So now you are stuck with finding ways of performance tuning something else. In one recent case, the best solution came down to adding disk space and short stroking the IO. By reducing the used space on the SAN from 90% down to 80%, we were able to double performance. Granted, that still wasn't good enough - but it helped.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw
    Learn Extended Events
