Big Data - Cost of acquiring skills

Question

Big Data - Cost of acquiring skills

David.Poole

SSC Guru

Points: 75898
More actions
April 13, 2014 at 9:47 pm

#104235

Comments posted to this topic are about the item Big Data - Cost of acquiring skills
LinkedIn Profile

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply

Gary Varga SSC Guru Points: 82166 More actions · Answer 1

Most R&D or experimentation can be done on a "no licence cost" basis. The one cost that NEVER seems to be allowed is time. The nearest I have seen is when a project manager allows a little time for a team to get up to speed with a new technology.

One of the thoughts must be given weight when selecting software to use for experimentation, R&D or a pilot is what is the plan if it is successful? Stick with that technology? Migration to a different technology? etc.

It is not necessarily a cheap way of doing it when the management shout "Ship it!!!" and the support team refuse to support it without additional training as it is nothing they have ever supported before.

NOTE: This is not an anti-OSS rant but a piece of cautionary advice.

Gaz

-- Stop your grinnin' and drop your linen...they're everywhere!!!

Eric M Russell SSC Guru Points: 125522 More actions · Answer 2

If management allows an hour for lunch and places no restrictions on how early one can enter the office or how late one leaves, then there can always be time for study. Don't ask for time, just find it and take it.

"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

David.Poole SSC Guru Points: 75898 More actions · Answer 3

Eric, although there is always tme to squeeze something in and learn I think a more formal recognition of the need for experimentation time is needed.

There has to be quite a bit of rigour and due diligence around experimentation because if an experiment is successful then the outcome can represent disruptive change.

When a disruptive change takes place there will be a minority in favour of it, probably 40% who will accept it with a reasonable amount of proof, probably 40% who will only accept it IF there is extensive proof and the rest will out-right reject it.

Normally the resisters are the domain experts who have a clear view of the magnitude of the impacts of the change. I've been on that side of the fence myself and know that going into big organisational change without the due diligence is an extremely painful experience. Shining a harsh light on an experiment can help to thrash out gaps in the methodology and thought processes behind the experiment. Equally, it can also reveal something that ends up being a really good solution to the problem that inspired the experiment in the first place.

LinkedIn Profile

Gary Varga SSC Guru Points: 82166 More actions · Answer 4

Eric M Russell (4/14/2014)
If management allows an hour for lunch and places no restrictions on how early one can enter the office or how late one leaves, then there can always be time for study. Don't ask for time, just find it and take it.

Good point but I was referring to pilot projects rather than learning. Learning is easier to do piecemeal. In fact it has been proven to be a better learning technique with deeper understanding often attained. I mean more of a structured, proven assessment of use.

Big data is as much about systems design as it is data management.

Gaz

-- Stop your grinnin' and drop your linen...they're everywhere!!!

Eric M Russell SSC Guru Points: 125522 More actions · Answer 5

I've got half a dozen retired PC boxes stacked in a corner under my desk on which I'm installing a Ubuntu / Hadoop cluster. My intention is to prove whether or my skunkworks cluster can complete 3 concurrent aggregate queries on a TB of data in less time than the production SQL Server Enterprise instance that we currently use for a staging environment. This type of "project" would normally never get approved, which is just as well, because I've had to put it on the back burner for weeks at time while I focus on my real work.

"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

Gary Varga SSC Guru Points: 82166 More actions · Answer 6

Eric M Russell (4/14/2014)
I've got half a dozen retired PC boxes stacked in a corner under my desk on which I'm installing a Ubuntu / Hadoop cluster. My intention is to prove whether or my skunkworks cluster can complete 3 concurrent aggregate queries on a TB of data in less time than the production SQL Server Enterprise instance that we currently use for a staging environment. This type of "project" would normally never get approved, which is just as well, because I've had to put it on the back burner for weeks at time while I focus on my real work.

Glad that you have that kind of freedom. You do yourself credit using it that way.

Gaz

-- Stop your grinnin' and drop your linen...they're everywhere!!!

David.Poole SSC Guru Points: 75898 More actions · Answer 7

Eric M Russell (4/14/2014)
I've got half a dozen retired PC boxes stacked in a corner under my desk on which I'm installing a Ubuntu / Hadoop cluster. My intention is to prove whether or my skunkworks cluster can complete 3 concurrent aggregate queries on a TB of data in less time than the production SQL Server Enterprise instance that we currently use for a staging environment. This type of "project" would normally never get approved, which is just as well, because I've had to put it on the back burner for weeks at time while I focus on my real work.

It probably won't. If you've got an optimised structure recordset, particularly if you've got page compression on, then a scan through it will be fast.

Hadoop really comes into its own when you have much larger data volumes or you have data of a structure that doesn't naturally fit into a straight tabular format.

I was quite disappointed to find that 1 years worth of web site page impressions (roughly 1 billion records) on AWS 3 node EMR took about 3x as long as the same query on SQL Server 2005. The problem is that the type of query, the type of data and the volume of data didn't present a Hadoop shaped problem.

The beauty of your setup is that you will get to play with a genuine Hadoop cluster and learn the tricks and pitfalls.

One thing you learn quite early on is that having a process to merge smaller files up into bigger ones improves performance dramatically.

LinkedIn Profile

Doctor Who 2 SSCrazy Eights Points: 8177 More actions · Answer 8

I didn't know that SQL Server Developer Edition was something that one could purchase at a reduced rate. You gave the cost in British pounds - what's it cost in US dollars and where do you get it?

Rod

Eric M Russell SSC Guru Points: 125522 More actions · Answer 9

Doctor Who 2 (4/18/2014)
I didn't know that SQL Server Developer Edition was something that one could purchase at a reduced rate. You gave the cost in British pounds - what's it cost in US dollars and where do you get it?

It's about US $50, but maybe even lower with academic discount pricing. Sometimes they hand it out as raffle prizes at user group meetings or in swag bags at tech conventions.

"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

Doctor Who 2 SSCrazy Eights Points: 8177 More actions · Answer 10

Eric M Russell (4/18/2014)
Doctor Who 2 (4/18/2014)
I didn't know that SQL Server Developer Edition was something that one could purchase at a reduced rate. You gave the cost in British pounds - what's it cost in US dollars and where do you get it?
It's about US $50, but maybe even lower with academic discount pricing. Sometimes they hand it out as raffle prizes at user group meetings or in swag bags at tech conventions.

Where can I get it?

Rod

Eric M Russell SSC Guru Points: 125522 More actions · Answer 11

Doctor Who 2 (4/18/2014)
Eric M Russell (4/18/2014)
Doctor Who 2 (4/18/2014)
I didn't know that SQL Server Developer Edition was something that one could purchase at a reduced rate. You gave the cost in British pounds - what's it cost in US dollars and where do you get it?
It's about US $50, but maybe even lower with academic discount pricing. Sometimes they hand it out as raffle prizes at user group meetings or in swag bags at tech conventions.
Where can I get it?

I get it from my employer's MSDN subscription, but you can order it from Microsoft, online retailers. I've seen it on Amazon and eBay too.

It looks like you can purchase and download SQL Server 2014 Developer Edition direct from Microsoft for $59.

http://www.microsoftstore.com/store/msusa/en_US/pdp/SQL-Server-2014-Developer-Edition/productID.298540400

"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

David.Poole SSC Guru Points: 75898 More actions · Answer 12

Doctor Who 2 (4/18/2014)
I didn't know that SQL Server Developer Edition was something that one could purchase at a reduced rate. You gave the cost in British pounds - what's it cost in US dollars and where do you get it?

Amazon if you don't have an MSDN subscription

LinkedIn Profile

Doctor Who 2 SSCrazy Eights Points: 8177 More actions · Answer 13

Doctor Who 2

SSCrazy Eights

Points: 8177

April 19, 2014 at 12:14 pm

#1707278

Thank you Eric and David.

Rod