Could This Work for Databases?

  • The image to the side is from the Computer History Museum, which is a cool place if you're ever in Silicon Valley. It's one of the early Google corkboard servers. To save money and pack more servers into a smaller space, Larry Page bought cheap motherboards and parts, mounted four of them on a sheet of corkboard, and put each one into a rack space. It worked well for them, though their newer data centers have moved to more conventional servers.

    The entire article on how Google works is very interesting. One of the things that really struck me was the way they built their system, which is essentially a database of pages and a way to search them. I know that's simplistic, but they do run a database, albeit a specialized one.

    Their early design ran fundamentally against the way I've built almost every system I've worked on. Google used cheap computers (I've done that), but expected them to fail. So they built their application to handle failure and assumed that computers would die.

    RAID?

    Waste of money. Buy more computers instead.

    It's an interesting philosophy, and one that can scale up well. I've heard that some of their data centers had computers die and they didn't bother replacing them. I've done that as well, mostly because no one noticed the server had died and I was waiting for a phone call before putting back up a box that apparently wasn't being used anymore.

    This is where I think Microsoft is missing the boat. Oracle has its Parallel Server, which can scale out mightily, but Microsoft doesn't have a great scale-out strategy. I know it's hard; I've tried to solve the problem with application servers before and was looking forward to Application Server, but it doesn't seem to have worked that well and appears to have been abandoned by Microsoft. I haven't heard much about it, and it doesn't seem to have been updated in nearly three years.

    Imagine if you could set up your database server and pick 8 servers to function as hosts for your database. All the data is stored on 3 or more of them, so a 60GB database would require about 25GB on each server. Each is a single-CPU box with no RAID, so for rack mounts you could pay about $800 at today's Dell prices. They each have only 1GB of RAM, but the application can query any of them, so you essentially have an 8-way server handling traffic. Granted, connections probably wouldn't be maintained between batches, but it's still possible. You'd spend $6,400 on this setup and would expect 2-3 of the boxes to just fail.
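
    Just to make that storage math concrete, here's a quick back-of-the-envelope in Python (the numbers are simply the assumptions from the paragraph above: 60GB of data, 3 copies of everything, 8 nodes):

```python
# Back-of-the-envelope: how much disk each node needs when every piece of
# data is stored on 3 of the 8 nodes.
db_size_gb = 60        # logical database size
copies = 3             # each piece of data kept on 3 nodes
nodes = 8              # cheap single-CPU boxes

total_stored_gb = db_size_gb * copies      # 180 GB across the whole cluster
per_node_gb = total_stored_gb / nodes      # 22.5 GB, call it ~25 GB

print(f"Each node stores roughly {per_node_gb:.1f} GB of the {db_size_gb} GB database")
```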

    Contrast that with an 8-way server. Actually, Dell doesn't sell 8-ways, but a 4-way 6850 with 8GB of RAM and 200GB of RAID storage would run over $15k! That's more than double the cost. At that rate you could easily replace the cheap servers as they failed and still save money!
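
    And the cost side of the comparison, again just plugging in the prices quoted above (today's Dell list prices, more or less), plus a budget for replacing the 2-3 nodes you expect to lose:

```python
# Rough cost comparison between the scale-out cluster and one big box,
# using the prices quoted in the two paragraphs above.
cheap_node_price = 800
node_count = 8
scale_out_cost = cheap_node_price * node_count      # $6,400 up front

big_box_cost = 15_000                               # 4-way 6850, 8GB RAM, RAID

expected_failures = 3                               # nodes we expect to just die
replacement_budget = expected_failures * cheap_node_price   # $2,400

print(f"Scale-out cluster: ${scale_out_cost:,} up front, "
      f"${replacement_budget:,} to replace failed nodes")
print(f"Single 4-way box:  ${big_box_cost:,}")
```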

    Building a "grid" or "dispersed cluster" of SQL Server computers would be hard. I can't imagine the programming necessary to ensure data integrity, commit transactions, etc. But I bet the boys at Microsoft, Oracle, IBM, MySQL, and who knows where else can.
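
    For a feel of what that programming involves, here's a bare-bones sketch of textbook two-phase commit, the classic way to commit a transaction across several nodes. It's purely illustrative; it isn't how any of those vendors actually implement it, and it glosses over the hard parts (failures mid-protocol, recovery, locking):

```python
# Minimal two-phase commit sketch: the coordinator commits only if every
# participant votes yes in the prepare phase; otherwise everyone rolls back.
class Participant:
    def __init__(self, name):
        self.name = name

    def prepare(self, txn):
        # In a real system: write the change to a log, hold locks, then vote.
        print(f"{self.name}: prepared {txn}")
        return True   # vote yes

    def commit(self, txn):
        print(f"{self.name}: committed {txn}")

    def rollback(self, txn):
        print(f"{self.name}: rolled back {txn}")


def two_phase_commit(txn, participants):
    # Phase 1: ask every node to prepare and collect the votes.
    votes = [p.prepare(txn) for p in participants]
    # Phase 2: commit only on a unanimous yes, otherwise roll back everywhere.
    if all(votes):
        for p in participants:
            p.commit(txn)
        return True
    for p in participants:
        p.rollback(txn)
    return False


nodes = [Participant(f"node{i}") for i in range(1, 4)]
two_phase_commit("txn-42", nodes)
```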

    I just wonder if they'll ever get it working and available in a commercial product. Just imagine SQL Server Standard with a $500-per-node upcharge to join a cluster. I'd probably buy more licenses and build one large node out of my 4 servers.

    Steve Jones

  • Google (and other search engines) is a special animal from a DB perspective. If data is missing from a single search, or even a relatively small percentage of searches, no one is going to get worked up over it. It's a free service with remarkably small expectations from its users. There are very few applications that can afford this kind of failure and get away with it. Google just had the good common sense to realize that they could shave costs here and didn't have to be perfect. Even their advertisers can't easily complain if an ad-programmed search doesn't work quite right.

    Gives new meaning to the saying: "close only counts in horseshoes, hand grenades, nuclear bombs, and Google."

  • There is such a product, mostly. Netezza (http://www.netezza.com/products/products.cfm) was designed by some ex-Google guys, or something along those lines. The system works much the way Google's currently does.

    They are insanely fast and incredibly expensive. From what my spouse, who uses one, tells me, they don't support transactions and are designed for data reads.

  • Did anyone look at the cost of this "Application Server"? On MS's product information page, it looks like $3K PER PROCESSOR just for the application server product. If you have to license the applications running on each server in the cluster, it gets even more expensive. So it looks like you would save on hardware acquisition cost initially, but would pay through the nose for the software to run on the cluster. In addition, you would have the overhead of maintaining more physical machines, AND you've created more points of failure to track down.

    This model seems very contrary to the consolidation and virtualization that we're seeing out there.

    Personally, I'd rather have a good box, know it's not likely to fail, and know that if one component does fail (e.g., a couple of hard drives in a RAID array), the machine will continue to function and serve clients reliably.


  • I recently sat through a presentation by PolyServe; it basically does for SQL Server what you're suggesting (lots of redundancy, shared storage, scale-out).

    To me, the product seemed a lot like VM / VMotion, except that you get the hardware resources you need.


  • I agree with drnetwork's post. I doubt they'd use the same technology in their accounting department.


  • I've done something similar using replication: segment the identity spaces and replicate across servers. If one server goes down, the load balancer picks up and repoints at another (virtually identical, except for the identity spaces) server. Say you set your identities to start at 1 on Server 1, 1 billion on Server 2, 2 billion on Server 3, etc. You'll need to use bigints, or segment identities into smaller spaces, but that's just a design decision. You do have replication latency to deal with at that point, and that's real, so individual sessions won't usually cross servers -- but new sessions can be load-balanced across servers with no problems. A rough sketch of the identity layout is below.

    Works fine.
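
    A minimal sketch of that identity segmentation, assuming each server seeds its keys a billion apart as in the example (the names here are just illustrative, not real SQL Server API):

```python
RANGE_SIZE = 1_000_000_000   # one billion identities per server, as in the example

def identity_seed(server_number):
    # Server 1 starts at 1, Server 2 at 1 billion, Server 3 at 2 billion, ...
    return max(1, (server_number - 1) * RANGE_SIZE)

class ServerIds:
    """Hands out ids from one server's private range (stand-in for IDENTITY)."""
    def __init__(self, server_number):
        self.next_id = identity_seed(server_number)

    def new_id(self):
        assigned = self.next_id
        self.next_id += 1
        return assigned

# Two servers generating ids independently: the ranges never overlap, so rows
# can be replicated between them without primary-key collisions.
s1, s2 = ServerIds(1), ServerIds(2)
print(s1.new_id(), s1.new_id())   # 1, 2
print(s2.new_id(), s2.new_id())   # 1000000000, 1000000001
```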
