Shapes in the Clouds

  • Comments posted to this topic are about the item Shapes in the Clouds

  • Around 4 years ago, I read something about developers who rented database servers on the Internet. They had a choice of MySQL and SQL Server, I think. It's becoming more common now, and cheaper.

    I'm not sure how that differs from the article, but I guess providers would be giving DBAs different choices.

    A side question is, how much bandwidth to estimate? I remember the dial-up ISP optimization problem a decade ago: lack of modems during peak hours, plenty of unused modems in the middle of the night.

  • I've read the blogs and find the debate about the particulars of this subject interesting, and it's nice to read what Steve has posted here. But all I see at this point are the techies getting excited, and I can tell you from the management side that IT leaders are not buying into this idea right now. Why? Because it suggests giving up control of vital, private and often sensitive data.

    The concept is great, but if some focus does not start to shine on the management issues, this is going to be hard to sell into the real business world. I am thinking of one of our clients who has some 6 petabytes of vital company data currently hosted on their own secure servers. Imagine someone going in there and telling them that the data will be hosted... where? In the cloud? What cloud? Where? You can be sure that will be a VERY short business call - it ain't gonna sell.

    Right now this is a "we can do this" idea. But there is very little out there that talks about "should we do this", and "how do we do this so it can be sold".

    I don't know if it's a fair comparison, but let's not forget that Windows Vista was also a "great idea". Unfortunately, as one can only hope Microsoft has finally learned, simply being able to do something does not mean you should do it, or that if you do, it will take hold.

    There's no such thing as dumb questions, only poorly thought-out answers...

  • Ok, I'm lost. I can understand an end user not really caring where the connection physically ends up at, but I would think a DBA would have to care.

    At some point the URL has to ultimately represent and link to physical data on a physical device, doesn't it?

    Doesn't economics mean that you're basically going to be restricted to a limited number (2 or 3) of physical servers, more or less mirroring each other in some manner?

    Perhaps I am showing my ignorance of cloud computing. I understand it more as an ethereal front end or a form of parallel processing where multiple devices work together to give the impression of one massive device. However, at some point it all has to come down to actual physical materials being arranged in one direction or the other.

    Or would it be more like a super RAID striping system, where one only needs to grab stripes from enough different physical devices to be able to rebuild the complete information set? That would probably work well enough for reads, but writes across an indeterminate number of physical devices would seem to be a pain.
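    The striping idea can be pictured with a small sketch. It assumes simple XOR parity in the style of RAID 5, with one parity stripe and a fixed number of devices; nothing in the thread says a cloud provider stores data this way, and the function names are made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def stripe_with_parity(data: bytes, n_data_devices: int):
    """Split data into n equal stripes (zero-padded) plus one XOR parity stripe."""
    stripe_len = -(-len(data) // n_data_devices)  # ceiling division
    padded = data.ljust(stripe_len * n_data_devices, b"\0")
    stripes = [
        padded[i * stripe_len:(i + 1) * stripe_len]
        for i in range(n_data_devices)
    ]
    return stripes + [xor_blocks(stripes)]


def rebuild(stripes, missing_index):
    """Recover the stripe at missing_index from all the surviving stripes."""
    survivors = [s for i, s in enumerate(stripes) if i != missing_index]
    return xor_blocks(survivors)


# Four hypothetical data devices plus one parity device.
devices = stripe_with_parity(b"shapes in the clouds", 4)
lost = 2  # pretend the third device is unreachable
recovered = rebuild(devices, lost)
print(recovered == devices[lost])
```

    Reads survive the loss of any one device. The write side is where the pain shows up: changing a single stripe also means recomputing and rewriting the parity stripe, so every write touches at least two devices.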

    Or am I just way the hell, totally off track on this one?

  • One thing that does sort of turn me off on this idea is an experience I had a few years ago. We were migrating a business-critical app from Access to SQL Server, and the decision was made to let the database be hosted by an outside company, since that was a lot cheaper than paying for servers on-site.

    The migration went well, the application worked, but it was SLOW. A few milliseconds latency may not seem like much, but when it's a heavy OLTP app being used by a lot of people, it suddenly adds up to a very slow application.

    A few months later, fed up with the performance, we moved the database in-house. It went from SLOW to screaming fast, with no other change than that latency going away. (Bandwidth never even came into it. We were using about 10% of the pipe at peak load.) Everyone was thrilled.

    If cloud applications cause that kind of latency effect, I don't think they'll work for OLTP apps. I don't know enough about them to know if that'll be the case, but if the whole idea is that you won't know where the data is nor where the web page is, what if they end up on machines that are thousands of miles away from each other?
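    A back-of-the-envelope sketch shows why a few milliseconds can dominate. Every number here is an assumption chosen for illustration (the round-trip count and both latencies are hypothetical, not measurements from the app described above), and it assumes the round trips run one after another rather than in parallel.

```python
def time_waiting_on_network(round_trips: int, latency_ms: float) -> float:
    """Seconds spent waiting on the wire if round trips run sequentially."""
    return round_trips * latency_ms / 1000.0


# Assumed figures, for illustration only.
ROUND_TRIPS_PER_SCREEN = 200  # hypothetical chatty OLTP form

for label, latency_ms in [("in-house LAN", 0.2), ("hosted off-site", 5.0)]:
    seconds = time_waiting_on_network(ROUND_TRIPS_PER_SCREEN, latency_ms)
    print(f"{label}: {seconds:.2f} s of pure waiting per screen")
```

    Under these assumptions the hosted case spends about a second per screen doing nothing but waiting, while bandwidth use stays tiny. Cutting the number of round trips (batching statements, using stored procedures) attacks the same product from the other side.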

    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • I'm pretty skeptical myself. Ignoring for now the security/privacy concerns inherent in third-party data storage, performance would (I think) be a major issue.

    I've been in a situation with some similarity to the one GSquared described: our client opened up a second office with about twice as many users connecting to the database as in the original location. Every couple of weeks we'd get calls from that second site complaining about connection slowness. Every time, the assumption was that there was a problem in the database. Every time, it turned out not to be the case. Bandwidth was fine; it was the latency.

    Additionally, how can you have control over the hardware configuration if the machine hosting your database is indeterminate? How could you optimize performance for TempDB without knowing what drive it's on, or what the specs for that drive are? How can you determine file size limits and growth when you don't know how much room there is?

    It's possible that someone will be able to provide tools/services that can allay some of these concerns, but I haven't heard anything close to it yet.

  • Surprisingly, most of your concerns were what I heard in 95-99 about web apps. I think this will follow a similar path. There are plenty of people that host their own web apps, for example. SSC is self hosted. There are also people that buy a virtual web front, never worrying about any directory structure outside their site.

  • blandry (4/14/2009)

    ...Because it suggests giving up control of vital, private and often sensitive data.


    Right now this is a "we can do this" idea. But there is very little out there that talks about "should we do this", and "how do we do this so it can be sold".

    Good points, but takes sensitive data every day. Maybe they're an anomaly, but I think there are others out there, just not as big.

    I'm also not sure we can do this yet, or it's mature. It's probably more like 94-95 web technology. It's not there yet, but it's an idea.

  • bob.willsie (4/14/2009)

    Ok, I'm lost. I can understand an end user not really caring where the connection physically ends up at, but I would think a DBA would have to care.

    I've got some more writing about this later in the week, but someone does need to know what is where. But not necessarily the DBA.

    SANs are the first step of DBAs not knowing or caring about storage. At least they shouldn't. That might speak more to the training and monitoring of the SAN guys, but the server admins just need an SLA for performance of the disks.

    I think that might happen with SQL Server. Imagine if you could designate 4 servers to be SQL Server and install SQL once as a cloud service. Then you'd deploy to this cloud service, not knowing or worrying about whether your database was being served from node 1, 2, or even both.
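    As a rough sketch of that idea: the class below is entirely hypothetical (it is not a SQL Server feature or API), and round-robin placement is just an assumption to keep the example short. The point is only that callers name a database and never a node.

```python
import itertools


class SqlCloudService:
    """Hypothetical front end: callers name a database, never a node."""

    def __init__(self, nodes):
        self._nodes = list(nodes)
        self._placement = {}  # database name -> node name
        self._next_node = itertools.cycle(self._nodes)

    def deploy(self, database: str) -> None:
        """Place a database on whichever node is next in the rotation."""
        self._placement[database] = next(self._next_node)

    def connect(self, database: str) -> str:
        """Return a connection target; the caller does not choose the node."""
        return f"{self._placement[database]}/{database}"

    def fail_over(self, dead_node: str) -> None:
        """Move every database off a failed node without changing its name."""
        self._nodes.remove(dead_node)
        self._next_node = itertools.cycle(self._nodes)
        for database, node in self._placement.items():
            if node == dead_node:
                self._placement[database] = next(self._next_node)


service = SqlCloudService(["node1", "node2", "node3", "node4"])
service.deploy("Sales")
before = service.connect("Sales")
service.fail_over("node1")
after = service.connect("Sales")
print(before, "->", after)
```

    The application keeps asking for "Sales" before and after the failover; only the service knows that the answer moved to a different node.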

    Granted, someone might need to troubleshoot, but you'd hope that SQL would manage itself, firing hardware alerts when something didn't perform well. The hardware guys would fix things, leaving the DBA to manage, well, data (and security, packages, etc.).

  • Ah, now I understand better. I've spent most of my career in small shops as a "jack of all trades, master of none."

    Consequently I tend to think all the way down to the platter.

    Looking at it from your perspective I understand better.

    Thanks, Bobw

  • This actually sounds like an exciting idea. Maybe not so much having your SQL Server hosted on a third-party, but basically this looks like SQL Server virtualization. Like Steve mentioned you can have an intranet cloud to mitigate the privacy and performance concerns.

  • We really have all of this available now. With virtual servers, multiple physical servers host a virtual server that SQL Server sees. SANs do the same on the disk side. The only downside is that when an issue arises, performance or otherwise, a good relationship needs to be in place between the DBAs, SAN admins and server admins, or everything breaks down (picture finger pointing here). These are really human issues, not technical ones. If you lay SQL Server on a hosted virtual server with a SAN, it does not need to worry about the low-level locking structure; the SAN controllers do. SQL Server does not worry about how the CPU or memory is sliced and diced, because the virtual hosting mechanism (software or hardware, depending on configuration) does. That leaves security and selling the concept. The hosting company or the on-site administrators have to be secure, whether it is in the cloud or in the data center down the hall. My site, your site: if they aren't secure yesterday, today or tomorrow, then you've got trouble.

    Bottom line, there does not have to be a fundamental change to SQL Server if this is done on hosted virtual machines with SAN devices. If it is done in some other manner, then SQL Server will need changes to its locking process and its writing process.

    This all sank home to me at the Best of PDC meeting that was held here in Denver at the movie theater.


    It is all cool: 😎

  • Steve, while this is all possible - and in fact exists - I have a couple of concerns about how the issue is framed.

    You speak about "not knowing which SQL Server you are connecting to". The problem is it still assumes MS SQL Servers. The real issue we face today was well framed by Fabian Pascal in 1992 (don't start booing...for one thing, the book I am referencing is one in which he vigorously defended SQL as a language).

    The relational model freed the developer from the navigation details of "the database". What has happened since is that "the database" has become "the databases"; multiple databases on the same or different servers, which may be running SQL Server, Oracle, DB2, or simply be some kind of tabular file.

    In other words, "the database" grew so it was larger than the server, even though it is represented as if it were multiple databases. In his chapter on Distributed Databases, 17 years ago, he showed the need for DDBMS - distributed database management systems - that would free the developer from the navigational details of 'which database on which server'. (Fabian Pascal, Understanding Relational Databases with Examples in SQL-92, 1993. It's available for about $2 used, and it's perhaps the best intro to RDB out there aside from the Manga book.)

    Interoperability wasn't in any vendor's short-term interest, so the DBMS vendors never got into it. Meantime, ODBC came about, the concept of web services (which is a fine layer to separate on) came in, and products like Attunity came about. Attunity doesn't seem to have gotten substantial market traction, unfortunately, but the concept they followed clearly works: every time someone came up with a new way to represent tabular data (e.g. XML), they just wrote the one module needed.

    Developers shouldn't know from clouds or vendors or servers. They should know how to get data by stating "what", without any "where" or "how". We had that early on with DBMSs, but we've now lost it. Just putting it on a cloud isn't going to solve that, we need to be able to apply relational operators between servers without realizing we are doing so.
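    A toy sketch of "what" without "where", using Python's built-in sqlite3 module. Two attached in-memory SQLite databases stand in for two separate servers, and temporary views play the role of a catalog; the schema names server_a and server_b and the tables are invented for illustration. This is only an analogy inside one process, not a real distributed DBMS.

```python
import sqlite3

# One connection, two attached in-memory databases standing in for two servers.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS server_a")
conn.execute("ATTACH DATABASE ':memory:' AS server_b")

conn.execute(
    "CREATE TABLE server_a.customers (id INTEGER PRIMARY KEY, name TEXT)"
)
conn.execute(
    "CREATE TABLE server_b.orders "
    "(id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO server_a.customers VALUES (?, ?)",
    [(1, "Acme"), (2, "Globex")],
)
conn.executemany(
    "INSERT INTO server_b.orders VALUES (?, ?, ?)",
    [(10, 1, 250.0), (11, 1, 100.0), (12, 2, 75.0)],
)

# The "catalog": temporary views hide which database holds each table.
conn.execute("CREATE TEMP VIEW customers AS SELECT * FROM server_a.customers")
conn.execute("CREATE TEMP VIEW orders AS SELECT * FROM server_b.orders")

# The developer's query states only "what": no server names appear in it.
rows = conn.execute(
    "SELECT c.name, SUM(o.total) "
    "FROM customers AS c JOIN orders AS o ON o.customer_id = c.id "
    "GROUP BY c.name ORDER BY c.name"
).fetchall()
print(rows)
```

    If the orders table later moved to a third database, only the view definition would change; the join itself would stay as written.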

    Debating the cloud won't get us there. The whole point of database theory is that We Don't Care About Implementation. And if you are adding complexity for DBAs, it won't be more work for DBAs - no one wants to spend money on that - it is just more frustration and burnout for DBAs.

    Of course, I could be wrong.

    Roger L Reid

  • Roger,

    Thanks. I think some of this exists, in some ways, but I'm not sure we really have a cloud system that extends down to DBAs. We're still very server-centric in our administration. The vision I have is for a cloud of services that migrates the way VMs do today between physical machines, but at a higher level of abstraction, where the application itself moves. I know this can happen in some *Nix apps on a Beowulf or similar cluster, but I'm not sure there's a db that does this. Perhaps Oracle RAC?

    I do agree that vendors have somewhat messed up the model over time. I'd like to see that fixed with some standards. Maybe something that's like SOAP for db services.

  • In practice, as a DBA I am more involved with SANs than I should be, and I know a lot more about them than I should. SQL Server has been and will be tightly bound to its resources (storage, memory, processing power), and any change in any of these resources does affect its performance (either positively or negatively). I have seen migrations of SQL Server to SANs or virtual servers fail where even Exchange had been migrated flawlessly. I think you're right about cloud computing, but finding bottlenecks and fixing performance degradation will be even harder in these environments. The same warning applies here as for SANs and virtual servers: try before you buy!

