Flying high on the Big Data hot-air


krowley
Old Hand (358 reputation)

Group: General Forum Members
Points: 358 Visits: 429
Looking forward to the Stairway series. As an "accidental" DBA who loves this field I am always looking to learn more.
Rod
SSCertifiable (7.2K reputation)

Group: General Forum Members
Points: 7156 Visits: 2179
Interesting article. I've got one question, which I ask out of ignorance: what is R?

Kindest Regards, Rod
Connect with me on LinkedIn.
PHYData DBA
SSCrazy (2.1K reputation)

Group: General Forum Members
Points: 2063 Visits: 537
I am not a huge fan of random big data projects that serve useless, over-caffeinated marketing ideals.

I do like projects like a certain Health Industry project started in 2006 that has become invaluable and very relevant with our current reforms.
http://www.advisory.com/Technology/Crimson

I would dare to think that other industries, consumers, and regulators could find a use for similar collections of data and its analysis.

Anything worth doing is worth doing right since most things done wrong are worthless.
PHYData DBA
SSCrazy (2.1K reputation)

Group: General Forum Members
Points: 2063 Visits: 537
Phil Factor (7/30/2013)
@wim.bekkens

Thanks for that.
For a simple example, take a look at this series that is now coming out on Simple-talk. It walks you through an example application that involves using R to report KPIs in a SQL Server database.
Creating a Business Intelligence Dashboard with R and ASP.NET MVC: Part 1
Creating a Business Intelligence Dashboard with R and ASP.NET MVC: Part 2


Phil,
R was cool, but unless you are working at the NIS, isn't it dated?

Compared with some of the newer, highly maintained graphical stat display tools, such as the free Sigma Plot MySystat http://www.systat.com/MystatProducts.aspx, R feels like an amber screen from the '80s.

Don’t get me wrong. It was awesome and I used it.
Now there are IMHO better tools that require less heavy lifting.

If anyone is interested in a free, large-scale database solution used by Netflix, Twitter, eBay, reddit, Cisco, etc.: http://cassandra.apache.org/
Phil Factor
SSCarpal Tunnel (4.9K reputation)

Group: General Forum Members
Points: 4888 Visits: 3031
@Rod At Work

The R programming language is a GNU open-source project designed for statistical analysis of data. It supports a huge number of extensions in user-created packages that give it a bewildering versatility.
See http://en.wikipedia.org/wiki/R_programming_language
Although I can sympathise with PHYData DBA, I reckon that it is important to take R seriously, whatever else you use as well. Not only is it valuable purely for analysis of variance and factor analysis, but it is also widely used in universities for the teaching of parametric statistics. There are a huge number of resources, books, samples and videos around. It isn't going away!


Best wishes,

Phil Factor
Simple Talk
Miles Neale
SSCarpal Tunnel (4.2K reputation)

Group: General Forum Members
Points: 4246 Visits: 1695
Hi Phil,

I look forward to the replay of this editorial in 5 years and the reaction others have to it then.

As to the comment now, you say that "For years, we in the database industry have struggled..." and this may well continue as long as we try to limit data strictly to a database. Data has been, and will continue to be, a combination of databases and other data. As you state, we have known that for a considerable amount of time.

Also, you indirectly state that we have had everything we need to deal with Big Data for some time. Note that with R and SQL Server we lack only two things. If we can get it all using R and SQL Server, we must be able to use them to search all data generated by all users on all platforms: all email, memos, white papers, websites, and generated reports. And we have had that capability for some time?

I know that there is hype, and I agree that we should point and advise, and we really need to get on board and get all we can out of this current wave. There we have complete agreement. But we also need to look very closely at what we have and determine whether the tools we used yesterday are able to manage horizontally scaled data correctly, with a large enough sample to do basic analysis, and to meet the demands of investigations where every element matching certain criteria is required to be presented. If the tools of yesterday cannot do it, and those being developed today will also fall short, it might be good to get involved, as you have, in trying to scope and define the tools of the future.

Thanks for getting this on the table, good to see and hear.

Not all gray hairs are Dinosaurs!
RobertYoung
SSC Veteran (246 reputation)

Group: General Forum Members
Points: 246 Visits: 232
PHYData DBA (7/30/2013)
Phil,
R was cool, but unless you are working at the NIS, isn't it dated?

Compared with some of the newer, highly maintained graphical stat display tools, such as the free Sigma Plot MySystat http://www.systat.com/MystatProducts.aspx, R feels like an amber screen from the '80s.

Don’t get me wrong. It was awesome and I used it.
Now there are IMHO better tools that require less heavy lifting.

If anyone is interested in a free, large-scale database solution used by Netflix, Twitter, eBay, reddit, Cisco, etc.: http://cassandra.apache.org/


<sermon>
For the Point and Click crowd, I suppose SAS/SPSS/MiniTab/Excel and the like feel "modern". In terms of implementing (cutting-edge) stats, R has no peer. And it's taking market share from all of the closed-source alternatives. As I mentioned earlier, Oracle has followed in the footsteps of Postgres with its integration strategy. RStudio does a good job of integrating the necessary bits of R. No Point&Click as yet, however. And not likely, either. The momentum in R is toward R as a programming language, rather than R as a stat command language. Julia is the current front runner. As such, integrating into the SQL/database engine, rather than variations on ODBC from RStudio/etc., is the way forward.
</sermon>
PHYData DBA
SSCrazy (2.1K reputation)

Group: General Forum Members
Points: 2063 Visits: 537
RobertYoung (7/30/2013)
<sermon>
For the Point and Click crowd, I suppose SAS/SPSS/MiniTab/Excel and the like feel "modern". In terms of implementing (cutting-edge) stats, R has no peer. And it's taking market share from all of the closed-source alternatives. As I mentioned earlier, Oracle has followed in the footsteps of Postgres with its integration strategy. RStudio does a good job of integrating the necessary bits of R. No Point&Click as yet, however. And not likely, either. The momentum in R is toward R as a programming language, rather than R as a stat command language. Julia is the current front runner. As such, integrating into the SQL/database engine, rather than variations on ODBC from RStudio/etc., is the way forward.
</sermon>

I agree with a lot of what you say...
edit -- I have to admit that R has come a long way. I have done some reading about its latest advances as its own programming language. Currently the creator of S, John Chambers, is working on the R team.
http://en.wikipedia.org/wiki/R_(programming_language)

I like SciLab for a GNU-compatible library. It is based on Matlab syntax, so reusing or porting code from Matlab is easier. It is also very current and highly maintained.

Julia does have great promise. The fact that it is a true AST makes it even better. Hopefully it will soon be embraced and adopted as widely as S, R, Matlab, C, and Fortran. http://stats.stackexchange.com/questions/25672/does-julia-have-any-hope-of-sticking-in-the-statistical-community
Rod
SSCertifiable (7.2K reputation)

Group: General Forum Members
Points: 7156 Visits: 2179
Thank you, @Phil Factor, for the explanation of the R Language.

Kindest Regards, Rod
Connect with me on LinkedIn.
chrisn-585491
Hall of Fame (3.9K reputation)

Group: General Forum Members
Points: 3916 Visits: 2564
In terms of implementing (cutting-edge) stats, R has no peer.


The Python combo of pandas and numpy begs to disagree...