Pro SQL Server 2012 Practices: Chapter 17

I jumped at the opportunity to write a chapter for this book, Pro SQL Server 2012 Practices, because of all the fantastic people writing and the topics that were covered. I reviewed Chapter 12 a few weeks ago and I’ve been meaning to get this chapter, Big Data for the SQL Server DBA, read and reviewed, but I’ve just been sidetracked. I finally noticed it on my Trello board, weeks out of date, and decided to get the job done.

Carlos Bossy (b|t) is a great guy and a part of my SQL Family. We run into each other all the time at SQL Saturdays and other events. I’ve never listened to him present before, and now, after reading his chapter in the book, I’m very sorry. I’ll fix that at my first opportunity.

Carlos starts off nicely, giving a reasonable definition of “big data.” It’s not merely about size, although that has to play a part. He breaks it down to the three “Vs”: volume, variety and velocity. It makes sense. Just having a big database is not the same thing as dealing with big data. He takes the time to differentiate between the kinds of structures that work just fine in a relational system and the needs of big data, which just don’t fit well in relational storage. It’s as clear a delineation as I’ve seen on this topic.

The big data management system that Carlos then focuses on is Hadoop, which again makes sense. If you’re already working within the Microsoft stack, they’re moving on Hadoop with HDInsight on Azure. So learning how the Hadoop Distributed File System (HDFS) and MapReduce work together to provide you with massive data movement and storage keeps you within the set of tools with which you’re already familiar (although there is a ton of learning to do here). Carlos covers MapReduce in quite a lot of detail because it really is the driving force behind Hadoop and how you’re going to work with your big data in the system. He covers the topic by explaining in general terms how things will work, then walks you through an example with additional detail, and then walks you through another example with a lot more detail. The cumulative effect works well. The core concepts generally start to stick.
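To give a feel for the map, shuffle and reduce steps, here’s a minimal sketch of a word count in plain Python. This is my own illustration, not an example from the chapter and not Hadoop code; the function names (map_phase, shuffle, reduce_phase, word_count) are made up for this sketch, and everything runs in one process rather than across a cluster.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, roughly what a framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Collapse all the values for one key into a single total."""
    return key, sum(values)


def word_count(lines):
    """Run the three steps in sequence over an in-memory list of lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


if __name__ == "__main__":
    # Hypothetical sample input, purely for illustration.
    sample = ["big data is big", "data moves fast"]
    print(word_count(sample))
```

In a real Hadoop job the map and reduce functions run in parallel on many nodes against data stored in HDFS, and the framework handles the grouping step for you. The sketch only shows the shape of the idea.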

I really like how Carlos addresses the technology from a business solution perspective. It’s not that any one technology is a glorious thing, so let’s go use it. It’s that certain business problems require certain tools in order to solve them most efficiently. He makes an excellent case for DBAs understanding technologies like Hadoop in order to implement them where they are appropriate. It’s the kind of argument I’ve always tried to support. The concepts for management of big data are both simple and really complex, and Carlos attacks them both in order to educate. I’ve learned a lot from reading this chapter.

He then walks through different aspects of the type of work you’ll need to support as a DBA using a Hadoop server. There’s going to be some pretty major work around import/export of the data: into and out of Hadoop, yes, but also into and out of other types of storage, such as good old-fashioned relational SQL Server, data warehouses, cubes, and Excel/SharePoint. Carlos takes us through a detailed examination of the primary tool for this function within the HDInsight/Hadoop infrastructure in Azure, Sqoop. It’s a great way to learn how this stuff works.

Finally, Carlos goes over some pieces of the future of big data, again focusing primarily on the Microsoft/SQL Server stack and how it relates to Hadoop, but also SSRS and all the tools with which you’re already familiar.

Overall, it’s a great chapter. It’s a very solid introduction to a topic that I’m just starting to really wrestle with myself, so I’m happy to have it as a reference.
