Shapes in the Clouds

  • I've seen a SAN slow down a database horribly, because the whole thing was configured as a single RAID 5 array, and the "separate drives" for data and logs were actually just separate LUNs on the same array. Same for the "separate drive" for the OS, and for the system databases. These were not actually isolated from each other, they were just pieces of the same RAID 5.

    Had the admin change it to a RAID 1 for the OS and executables and system databases, another RAID 1 for the log files, and one more (4 disks) as RAID 10 for the data files. Sped the whole system up remarkably. Lots less space available, but we were only using about 10% of capacity and the databases and such were growing slowly and would continue to do so.
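
    For anyone wondering where the "lots less space" comes from, here is a rough sketch of the arithmetic. The 8-disk total and the two-disk mirrors are assumptions for illustration (the post only gives the 4-disk RAID 10), and `usable_disks` is a made-up helper, not a vendor tool.

```python
def usable_disks(level: str, disks: int) -> int:
    """Return how many disks' worth of capacity a RAID set exposes."""
    if level == "RAID1":
        if disks != 2:
            raise ValueError("this sketch assumes two-disk mirrors")
        return 1  # one disk of data, one mirror copy
    if level == "RAID5":
        if disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return disks - 1  # one disk's worth of capacity goes to parity
    if level == "RAID10":
        if disks < 4 or disks % 2:
            raise ValueError("RAID 10 needs an even number of disks, 4 or more")
        return disks // 2  # every disk is mirrored
    raise ValueError(f"unknown RAID level: {level}")


# Assumed starting point: all 8 disks in one RAID 5 array.
single_raid5 = usable_disks("RAID5", 8)

# Layout from the post: RAID 1 (OS and system databases), RAID 1 (logs), RAID 10 (data).
split_layout = sum(
    usable_disks(level, n)
    for level, n in [("RAID1", 2), ("RAID1", 2), ("RAID10", 4)]
)

print(single_raid5, split_layout)
```

    Under those assumptions the split layout gives up roughly three disks of capacity in exchange for spindles that no longer compete with each other.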

    That's definitely a case of "not knowing the actual set-up" being a very bad thing.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Actually, GSquared, I'd argue you still don't need to know. What you need is a better SLA, and better trained SAN people that actually proactively work on performance. They should be looking to catch bottlenecks, reconfigure disks, and understand that a SAN isn't magic.

    It should work for you, but you don't need the details. You just need the performance.

  • If we are talking about SQL Server doing what the virtual machines are doing underneath the covers, and there needs to be improvement there for true large-volume scalability, then a complete rework of how SQL Server handles memory, I/O, and CPU would be needed. It would need to start with locking and work up from there.

    Right now, if an instance of SQL Server is laid onto a virtual server with SAN disks, it does not know or care that multiple hosts are available underneath, but it is still limited to the resources that the virtual host can share with it. The locking on the SAN is controlled by the SAN, and again SQL Server won't care until the SAN fails to control it, if it ever does.

    So just as DBAs rely on the admins of the SAN and the virtual hosts, SQL Server itself relies on those systems: on the virtual host to control the floating of CPU and memory, and on the SAN to control the floating of disks (bad devices, etc.) and the locking of bytes on the physical devices so that corruption does not occur. The hand-off has to occur at some point. When setting up the SQL Server instance, discuss how the disk, CPU, and memory are configured with the appropriate teams, and get sign-off or agreement.

    Then, when issues arise (and eventually they will), a relationship with those teams has to have already been built.

    From a SQL Server perspective, the cloud really just seems like an option: you need a database, here it is. The back end is still the back end, and it still needs to be fed and groomed by somebody.

    Or am I lost and confused again? Am I missing a part of the picture?

    🙂

    The big thing with Oracle RAC is that it handles the locking between devices. There is a master/slave relationship that can float, but one server or another is always in control. Locking chaos is not allowed. This makes it a bit tougher to track down locking problems, as you have to know which server is holding the master lock. This is very similar to DEC's LAVC technology from the '80s.

    Mark

  • Steve Jones - Editor (4/15/2009)


    Actually, GSquared, I'd argue you still don't need to know. What you need is a better SLA, and better trained SAN people that actually proactively work on performance. They should be looking to catch bottlenecks, reconfigure disks, and understand that a SAN isn't magic.

    It should work for you, but you don't need the details. You just need the performance.

    Well, let's put it this way: SOMEONE needs to know, whether that's the DBA or the SAN admin.

    It's just another case of "the new magic system that makes it so you don't need to know what's going on behind the curtains", which means that another person takes over that knowledge for you. Just more specialization. Having a SAN expert and SLA and all that still feels to me like having to hire a hard-drive expert for your data center. Of course, the same could be said about the DBA, and with the same amount of truth.

    We're just getting more and more specialized as the technology gets more and more complex.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC

  • I think we get more specialized in some ways, and in larger groups, that specialization is needed. You'd like to think that we would get more reliable as well, so we wouldn't need "full-time" specialization. That is the case with some companies. They really need a talented DBA, just not an FTE.

    You are right, though: someone needs to know. Unfortunately, many people I've seen "trained" on SANs are just getting by.

  • I just wonder how long it'll be before the title "Virtual Technician", or "Virtual System Admin", or some such will start showing up on resumes. Or does it already, and I just don't know it yet?

    Which leads to a point I hadn't thought of before. We'll need to start changing some of the interview questions for DBAs as these things advance.

    I consider it a valid question right now, if the job is DB Admin (not DBA = Database Dev), to ask about RAID configurations for OLTP vs OLAP. As SANs become more common, or as other solutions lead to more virtualized/cloud servers, that question will go away. But will it be replaced with something else that applies to virtual database servers, or will it just go away?

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC

  • I think the "Law of Job Titles" says that we'll always have more. They'll definitely be replaced!

  • One thing I find amusing right now:

    The database servers I currently work with are virtual machines. All their hard drives are virtual drives, existing on a single RAID 5 array. The databases aren't transactional enough for that to matter for performance. BUT, the sys admins and prior DBA made sure the data files and log files are on separate virtual drives. They're on the same platters, but have different drive letters, "because you're supposed to have them on separate drives".

    As these technologies advance, I have to chuckle when I wonder how many things like that will be perpetuated, long past when they actually make any sense at all.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC

  • I do not worry about the drive letter, but at creation time I do go ahead and split filegroups, so that if there is a shift in performance or in the physical layout of devices, it is easy to move the I/O around. It also gives me a chance to see whether one set of data is getting more I/O activity than another, which can be helpful.

  • SANs are very good at delivering large amounts of data very fast. But SQL Server needs a whole lot of small I/O requests handled quickly, without anything reordering the requests or telling SQL Server that data is on disk while it's still sitting in some buffer on the way to disk. Read a little more about the constraints that SQL Server puts on a disk subsystem (local, remote, or anywhere you like) and you'll find out soon enough that most of the optimizations present on a SAN won't boost disk performance at all in this case.

    SQL Server queues many small disk I/O requests but does not sit around waiting for them to finish when work can be done on other SQL queries. Virtual machines are not always equipped to handle this large amount of asynchronous disk I/O properly, and sometimes decide to wait until the virtual disk becomes available before proceeding. Luckily (!) SQL Server will not complain, because the entire machine is put on hold, so SQL Server will not even know about the delay. Things generally get a bit worse when the virtual machine must wait for a SAN to confirm the request.

    It is possible to run SQL Server on a virtual server with a SAN connected to it, but are you sure that your administrators have all the required skills and share enough knowledge to communicate properly about these issues? When SQL moves into the cloud, even more collaboration among even more administrators will be needed to achieve acceptable performance for any application that needs more than a handful of data. For nine out of ten SQL instances, the transaction volumes do not need any significant disk performance, but when your webshop is way too slow, every system administrator will blame SQL Server, because their virtual servers are running fine until you prove otherwise.

    For really BIG companies this won't be an issue too often. At their mid-size brothers, hardware is bound to much lower budgets and administrator skills do not always keep up with the latest trends. I am very curious about your experience with these companies when they virtualized their servers. But maybe that should be discussed in another thread ...

  • It's one of the problems of "cloud" discussions. We may or may not be talking about the same thing. But I think Larry Ellison made a nice quote on it - "OK - we do clouds. We just never called it that."

    We may be talking about different things, but just understanding if that's true and why that's true is useful (for me anyway).

    Question is - WHY would a cloud be exciting? WHAT differentiates a cloud from an old fashioned time sharing system?

    "My" developers "don't care" where the database lives, because connection is abstracted through DNS and everything is replicated (with block order and transactional grouping preserved) with XOSoft to a remote location. So far so good, at least for a single instance. You don't know if it's on Jupiter or under your desk.

    Problem (and that's why I reference Pascal's SQL-92 book) is we don't get the RDBMS benefit of not handling navigation once we have multiple instances - especially on multiple vendors. We may not know WHERE the server is, but we still need to know WHAT it is: this is SQL Server 2005, this is Sybase ASE. And we can't treat them like One Big Database If You Have the Rights.

    No question we missed the boat on DDBMS. Not even the Open Source people got that one - in fact, most of all they didn't get it ("we only support MySQL (or SQLite) because we're cultural revolutionaries... or something...")

    Having storage on big fluffy amorphous servers is great, but if I still have to have a connection string for this database and another for that one and then hand join the info in my app, I don't really have a database, I have an ISAM system. It may be an ISAM system on a Cloud, but it's still an ISAM system.
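
    To make the "hand join" concrete, here is a minimal sketch. Two in-memory SQLite databases stand in for the separate servers (the real case would be something like SQL Server and Sybase), and the table names and rows are invented for illustration.

```python
import sqlite3

# Two independent databases, each reached through its own connection,
# standing in for "this database" and "that one".
customers_db = sqlite3.connect(":memory:")
orders_db = sqlite3.connect(":memory:")

customers_db.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
customers_db.executemany(
    "INSERT INTO customers VALUES (?, ?)", [(1, "Ada"), (2, "Grace")]
)

orders_db.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
orders_db.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(10, 1, 25.0), (11, 1, 5.0), (12, 2, 40.0)],
)


def hand_join():
    """Join customers to orders in application code, one query per database."""
    # The app, not the database engine, decides how to navigate:
    # pull one side into a lookup table, then walk the other side.
    names = dict(customers_db.execute("SELECT id, name FROM customers"))
    rows = orders_db.execute("SELECT customer_id, total FROM orders ORDER BY id")
    return [(names[customer_id], total) for customer_id, total in rows]


hand_joined = hand_join()
print(hand_joined)
```

    The join logic lives in the application here, which is the navigation work a relational system is supposed to take off your hands; with a real distributed DBMS it would be written once, as a single query.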

    If on the other hand, you put something like Attunity on a cloud-like place, as well as everything else, and maybe a nice Cold Fusion type interface to hide even Attunity, well, now maybe we approach the promise of the relational database.

    It's a shame that just as the hardware is strong enough to bring the promise to life, everyone is trying to force everything back into hierarchies. On clouds.

    Roger L Reid

  • Hi, Steve,

    I'm not sure if this is exactly on-topic - but it sort of fits in between "running SQL on in-house servers" and "running SQL in the clouds". Is it possible to host SSIS on a shared SQL Server installation? I know most Windows-based web hosts offer SQL Server databases; the better ones run SQL Server on a dedicated server within their web farm. Is there something about the BI stack, and SSIS in particular, that makes it difficult, or dangerous, to run in the same environment? It seems like it would be a good value-added option for the Windows-based hosts to offer the full stack to their clients, and it would be cheaper than a virtual dedicated SQL Server, especially for smaller sites.

    Eric Flamm, Flamm Consulting
