SQLServerCentral is supported by Redgate

Are you looking to Hadoop?

By Steve Jones

I hadn't even heard of Hadoop before, but there was a Hadoop World conference recently and it came to my attention on Twitter. I saw a quote that said "JP Morgan Chase is counting on an order of magnitude savings on data warehousing." Since Hadoop is primarily a Linux-based system, and is set up on Win32 systems only for development, not production, perhaps that's not surprising.

I tried to read through the quickstart on Apache's site for the common core installation and walk through a few examples, but it's a little hard to tell exactly what the buzz is about. Wikipedia was more help, pointing me to the MapReduce papers that Google published. I'll see if I can work through them at some point. Hadoop is available under a free license, and the list of companies using it to process large data sets is impressive: Yahoo!, Amazon, Facebook, and more.

So what's the purpose? Hadoop appears to allow clusters of servers to perform data processing very efficiently. It's built on its own distributed file system that scales to handle petabytes of data. That might seem like more data than you and I will ever need to work with, but I remember when it was a challenge to get enough disk drives together to assemble a terabyte in a server. Now I have 1.5TB in my desktop, with room for more.
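For anyone who finds the idea easier to follow in code, here is a minimal sketch of the MapReduce model from the Google papers, written in plain Python. It does not use Hadoop or any of its APIs. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for illustration, and everything runs in a single process rather than across a cluster.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group values by key, the step a framework performs between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values collected for a single key into one result."""
    return key, sum(values)


def word_count(documents):
    # Assumption for illustration: on a real cluster each document (or file
    # block) could be mapped on a different server; here we simply loop.
    mapped = []
    for doc in documents:
        mapped.extend(map_phase(doc))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    docs = ["big data is big", "data about data"]
    print(word_count(docs))  # {'big': 2, 'data': 3, 'is': 1, 'about': 1}
```

The point of the split is that the map and reduce steps work on independent pieces of data, which is what would let a framework spread them across many servers.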

It's an interesting project, and with data volumes constantly growing, I wonder when we'll see a similar technology in Microsoft's data processing platform. They already purchased a search technology company based on Hadoop, and we might see this used in Bing.

I expect this type of processing, and others like the StreamInsight features in SQL Server 2008 R2, to complement, rather than supplant, the traditional SQL database engine.

Steve Jones

The Voice of the DBA Podcasts

Everyday Jones

The podcast feeds are available at sqlservercentral.mevio.com. Comments are definitely appreciated and wanted, and you can get feeds from there.


Today's podcast features music by Everyday Jones. No relation, but I stumbled on to them and really like the music. Support this great duo at www.everydayjones.com.

I really appreciate and value feedback on the podcasts. Let us know what you like, don't like, or even send in ideas for the show. If you'd like to comment, post something here. The boss will be sure to read it.
