Flying high on the Big Data hot-air


krowley
Looking forward to the Stairway series. As an "accidental" DBA who loves this field I am always looking to learn more.
Rod
Interesting article. I've got one question, which I ask out of ignorance: what is R?

Kindest Regards,
Rod
PHYData DBA
I am not a huge fan of random big data projects that serve useless, over-caffeinated marketing ideals.

I do like projects like a certain Health Industry project started in 2006 that has become invaluable and very relevant with our current reforms.
http://www.advisory.com/Technology/Crimson

I would dare to think that other industries, consumers, and regulators could find a use for similar collections of data and their analysis.

Anything worth doing is worth doing right since most things done wrong are worthless.
PHYData DBA
Phil Factor (7/30/2013)
@wim.bekkens

Thanks for that.
For a simple example, take a look at this series that is now coming out on Simple-talk. It walks you through an example application that involves using R to report KPIs in a SQL Server database.
Creating a Business Intelligence Dashboard with R and ASP.NET MVC: Part 1
Creating a Business Intelligence Dashboard with R and ASP.NET MVC: Part 2


Phil,
R was cool, but unless you are working at the NIS, isn't it dated?

Compared to some of the newer and highly maintained graphical stat display tools, such as the free Sigma Plot MySystat (http://www.systat.com/MystatProducts.aspx), R feels like an amber screen from the '80s.

Don’t get me wrong. It was awesome and I used it.
Now there are, IMHO, better tools that require less heavy lifting.

If anyone is interested in a free large-scale database solution used by Netflix, Twitter, eBay, reddit, Cisco, etc.: http://cassandra.apache.org/
Phil Factor
@Rod At Work

The R programming language is a GNU open-source project designed for statistical analysis of data. It supports a huge number of extensions in user-created packages that give it a bewildering versatility.
See http://en.wikipedia.org/wiki/R_programming_language
Although I can sympathise with PHYData DBA, I reckon that it is important to take R seriously, whatever else you use as well. Not only is it valuable purely for analysis of variance and factor analysis, but it is also widely used in universities for the teaching of parametric statistics. There are a huge number of resources, books, samples and videos around. It isn't going away!


Best wishes,

Phil Factor
Simple Talk
Miles Neale
Hi Phil,

I look forward to the replay of this editorial in 5 years and the reaction others have to it then.

As to the comment now, you say that "For years, we in the database industry have struggled..."
and this may well continue as long as we try to limit data strictly to a database. Data has been, and will continue to be, a combination of databases and other data. As you state, we have known that for a considerable amount of time.

Also, you indirectly state that we have had everything we need to deal with Big Data for some time. Note that with R and SQL Server we lack only two things. If we can get it all using R and SQL Server, we must be able to use them to search all data generated by all users on all platforms: all email, memos, white papers, websites, and generated reports. And we have had that capability for some time?

I know that there is hype, and I agree that we should point and advise, and we really need to get on board and get all we can out of this current wave. There we have complete agreement. But we also need to look very closely at what we have and determine whether the tools we used yesterday can manage horizontally scaled data correctly, with a large enough sample to do basic analysis, and can also meet the demands of investigations where every element of certain criteria is required to be presented. If the tools of yesterday cannot do it and those being developed today will also fall short, it might be good to get involved, as you have, in trying to scope and define the tools of the future.

Thanks for getting this on the table, good to see and hear.

Not all gray hairs are Dinosaurs!
RobertYoung
PHYData DBA (7/30/2013)
<sermon>
For the Point and Click crowd, I suppose SAS/SPSS/MiniTab/Excel and the like feel "modern". In terms of implementing (cutting-edge) stats, R has no peer. And it's taking market share from all of the closed-source alternatives. As I mentioned earlier, Oracle has followed in the footsteps of Postgres with its integration strategy. RStudio does a good job of integrating the necessary bits of R. No Point&Click as yet, however. And not likely, either. The momentum in R is toward R as a programming language, rather than R as a stat command language. Julia is the current front runner. As such, integrating into the SQL/database engine, rather than variations on ODBC from RStudio/etc., is the way forward.
</sermon>
PHYData DBA
RobertYoung (7/30/2013)
I agree with a lot of what you say...
Edit: I have to admit that R has come a long way. I have done some reading about its latest advances as its own programming language. Currently the creator of S, John Chambers, is working on the R team.
http://en.wikipedia.org/wiki/R_(programming_language)

I like SciLab for a GNU compatible library. It is based on Matlab syntax, so reusing or porting code from Matlab is easier. It is also very current and highly maintained.

Julia does have great promise. The fact that it is a true AST makes it even better. Hopefully it will soon be embraced and adopted as widely as S, R, Matlab, C, and Fortran. http://stats.stackexchange.com/questions/25672/does-julia-have-any-hope-of-sticking-in-the-statistical-community
Rod
Thank you, @Phil Factor, for the explanation of the R Language.

Kindest Regards,
Rod
chrisn-585491
In terms of implementing (cutting-edge) stats, R has no peer.


The Python combo of pandas and numpy begs to disagree...