Hadoop and SQL Server

There has been a lot of media attention on Hadoop in the last few years. In fact, Microsoft has spent a lot of resources to build the HDInsight version of the platform and integrate it into SQL Server. I've read quite a bit about how to set up and query with Hadoop, but haven't used it for a real project. Relatively few people seem to be finding it to be a replacement for, or a better solution than, SQL Server. We published a great introduction to Hadoop written by David Poole a while back, and recently I ran across another nice writeup from someone I think is a very talented SQL Server professional.

Michelle Ufford (@sqlfool) wrote a piece asking if Hadoop is better than SQL Server. Michelle notes that Hadoop is a different platform, and it's a great way to consume lots of data. In fact, she has a graph from EMC talking about the data explosion and how we are still at the low end of the exponential growth curve of data production. It's a sobering thought, and I tend to agree with Michelle and EMC on the growth of data.

I had hoped Microsoft would do more with FILESTREAM and FileTable to help meet the challenges of large volumes of data, but it seems that very little has been done with those features in the last version of SQL Server. I have little hope that additional investment will come in the future. Instead, it seems Microsoft is leaning towards using Hadoop as one way to process and consume large volumes of data.

I wrote about Hadoop in 2009 when it was a young project, and I suspected it would enhance and work with, rather than supplant, the RDBMS. There are certainly other technologies out there to help with this, but if you are working with volumes of data that exceed what a single instance of SQL Server can handle (at a reasonable cost), you might think about learning a bit about Hadoop. It might not solve your issues, but in case it can, it would be good to know something about it.

