• adrienne.lore (3/15/2012)


    Is fragmentation seriously no longer a problem with SANs?

    It depends :). Fragmentation primarily slows things down by turning a logically sequential operation into an effectively random one at the physical level (my testing shows this affects both spindles and SSDs, though spindles much, much more than SSDs).

    On spindles, you end up with your heads moving back and forth between the (closer to the) inside and the (closer to the) outside of the disk multiple times during your operation, and perhaps moving a larger distance each time. Advanced file systems (and SANs) use something similar to what back in the old NetWare days was termed "elevator seek" or "elevator scan"; i.e. when you have a bunch of requests, order them for minimal head movement, which reduces aggregate latency (though any one request may suffer).
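    To make the elevator idea concrete, here is a minimal sketch in Python. It is not NetWare's or any SAN's actual implementation. The function names and the track numbers are made up for illustration, and it models the simple variant where the head sweeps upward through the pending requests and then comes back down for the rest.

```python
def total_head_travel(start, requests):
    """Sum of track distances the head covers servicing requests in the given order."""
    position, travel = start, 0
    for track in requests:
        travel += abs(track - position)
        position = track
    return travel


def elevator_order(start, requests):
    """Reorder requests: sweep upward from the head position, then back down.

    Illustrative only; real schedulers also deal with new arrivals,
    starvation limits, and so on.
    """
    upward = sorted(t for t in requests if t >= start)
    downward = sorted((t for t in requests if t < start), reverse=True)
    return upward + downward


# Hypothetical queue of pending track numbers, head currently at track 53.
pending = [98, 183, 37, 122, 14, 124, 65, 67]
head = 53

arrival_travel = total_head_travel(head, pending)
elevator_travel = total_head_travel(head, elevator_order(head, pending))

print("arrival order :", arrival_travel, "tracks")
print("elevator order:", elevator_travel, "tracks")
```

    With this particular queue the reordered pass covers less than half the distance of servicing requests in arrival order, but the request for track 37 now waits until the upward sweep finishes, which is the "any one request may suffer" part.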

    If your particular SAN LUN is part of a pool of disks shared with dozens of other databases, servers, some file servers, and the Exchange box, and those other apps are always active, then fragmentation of your particular files or the data in them is unlikely to matter as much, since the heads are going to have requests from all over the disk most of the time, anyway.

    If you have dedicated spindles for your database alone, and some of your IO is logically sequential (scans, not seeks, of multi-MB or multi-GB indexes), then both internal and external fragmentation may well play a part in aggregate throughput.
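    As a rough illustration of why a logically sequential scan hurts on dedicated spindles when the file is fragmented, the sketch below compares head travel for the same eight-extent scan under two layouts. The extent positions are invented, and the model ignores rotational latency, caching, and striping, so treat it as a toy and not a measurement.

```python
def scan_travel(extent_positions):
    """Head travel (in arbitrary track units) to read extents in logical order."""
    return sum(
        abs(nxt - cur)
        for cur, nxt in zip(extent_positions, extent_positions[1:])
    )


# Hypothetical layouts for one file made of eight extents, listed in logical order.
contiguous = [500, 501, 502, 503, 504, 505, 506, 507]
fragmented = [500, 10, 990, 980, 20, 510, 30, 970]  # middle, start, end, end, start, ...

contiguous_travel = scan_travel(contiguous)
fragmented_travel = scan_travel(fragmented)

print("contiguous scan:", contiguous_travel)
print("fragmented scan:", fragmented_travel)
```

    On a shared pool the heads are already being dragged all over the disk by other workloads, so the gap between the two layouts would be much less visible there.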

    Linchi's analysis was well done, but limited in a couple of ways. For one, my very cursory reading suggests the DB files were probably created in sequential free-space chunks, rather than with the first chunk in the middle of the disk, the second at the beginning, the third at the end, the fourth at the end, the fifth at the beginning, and so on. For another, Linchi only created a 10GB file and used 4.5GB of it, and I didn't see how large the SAN's cache was. An 8GB cache plus readahead on a 10GB database could skew results significantly, and I believe the Symmetrix DMX-2 supported up to a 256GB cache. To really see what happens on disk, you need to disable or overwhelm the cache.
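    A back-of-the-envelope way to see the cache problem: the sketch below computes what fraction of a data set could sit entirely in the array cache. The 8GB cache is my hypothetical figure, not a number from Linchi's test, the 100GB case is just an example of "overwhelming" the cache, and the function name is made up. It assumes uniform access and ignores readahead, so it is only a crude upper bound.

```python
def cacheable_fraction(cache_gb, data_gb):
    """Fraction of the data set that fits in cache (1.0 means all of it)."""
    if data_gb <= 0:
        raise ValueError("data_gb must be positive")
    return min(1.0, cache_gb / data_gb)


# Sizes in GB; the 8GB cache is an assumed value for illustration.
used_data = cacheable_fraction(8, 4.5)    # only the 4.5GB actually used
whole_file = cacheable_fraction(8, 10)    # the full 10GB file
big_test = cacheable_fraction(8, 100)     # a data set sized to swamp the cache

print(f"4.5GB used data: {used_data:.0%} cacheable")
print(f"10GB file      : {whole_file:.0%} cacheable")
print(f"100GB data set : {big_test:.0%} cacheable")
```

    If most of the test data can live in cache, the physical layout on the spindles barely gets exercised, which is why the test data set needs to be several times larger than the cache (or the cache disabled) before the disk-level effect shows up.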