Hello Nick, and many thanks for this R tutorial.
I have some absolute R beginner's questions. Are all the graphs generated with RStudio? If so, could they be embedded in an SSRS report, or in a report from another reporting tool (e.g. MicroStrategy, Tableau Desktop...)?
Some people indicate in their comments that using R libraries with versions of SQL Server prior to 2016 seems to be possible.
So what does SQL Server 2016 make possible, regarding the use of R libraries, that was not available in prior versions of SQL Server?
Many thanks for your feedback.
Thanks for your questions; I will try my best to answer them. Yes, the graphs here were created in R, using the ggplot2 library. This library has recently been recreated in Python as well. But no, they are not directly reproducible in SSRS or Tableau.
I have used R here because it is a statistical language and it makes exploratory analysis with robust statistics really easy to implement. Visualising data is a key part of exploratory analysis and being able to do this in R is great. But in terms of communicating results, there is no reason why you couldn't save the results (it is just data after all) to SQL Server and then use any tool (SSRS, Tableau, PowerBI, Excel...) to create charts using this data.
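As a rough sketch of that hand-off, the snippet below computes a few summary statistics and writes them to a database table that a reporting tool could then chart. It is written in Python rather than R so that it runs with nothing but the standard library, and it uses an in-memory SQLite database purely as a stand-in for SQL Server. The function name `save_summary`, the table `analysis_summary`, its columns and the sample values are all made up for illustration.

```python
import sqlite3
import statistics


def save_summary(conn, group_values):
    """Write per-group summary statistics to a table that reporting tools can read.

    `group_values` maps a group name to a list of numeric observations.
    The table and column names are hypothetical.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS analysis_summary ("
        "group_name TEXT PRIMARY KEY, n INTEGER, median REAL, mean REAL)"
    )
    rows = [
        (name, len(values), statistics.median(values), statistics.fmean(values))
        for name, values in sorted(group_values.items())
    ]
    # INSERT OR REPLACE means re-running the analysis overwrites old results
    # instead of duplicating them.
    conn.executemany(
        "INSERT OR REPLACE INTO analysis_summary VALUES (?, ?, ?, ?)", rows
    )
    conn.commit()
    return rows


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # stand-in for a real SQL Server connection
    sample = {"north": [10.0, 12.0, 11.0, 95.0], "south": [8.0, 9.0, 7.5]}
    save_summary(conn, sample)
    for row in conn.execute("SELECT * FROM analysis_summary ORDER BY group_name"):
        print(row)
```

Once the results sit in an ordinary table like this, the charting tool only needs a plain query against it and does not need to know that the numbers came from R.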
Tableau also offers R integration (http://www.tableau.com/new-features/r-integration). So you could do all of this within Tableau. I think this is interesting, as the integration of R requires not only good knowledge of Tableau and scripting but also of statistics. Therefore it is perhaps a step away from Tableau's "analytics for everybody" ethos. But it speaks strongly to the popularity and uniqueness of R that companies like Tableau, Microsoft and Oracle are investing heavily in the integration of R with their products.
With regard to the integration of R with SQL Server 2016, this offers significantly more functionality than many other products or solutions. You have always been able to use R to read data from a database using an ODBC connection. It doesn't matter whether your data source is an earlier version of SQL Server, MySQL, Oracle, Teradata... reading data from these sources into R is dead simple. However, you are then bound by the memory limits of your R workstation and are typically limited to non-parallel performance (although R does have libraries for parallel processing that you can use).
Revolution Analytics (the makers of Revolution R, now part of Microsoft) have completely reversed this with their integration of R with databases, of which SQL Server 2016 is the latest. You can still read data from the database out to R, but you can also push the R analysis from R into the database. This means you can run much larger analyses over much larger datasets by leveraging the performance capabilities of the database (e.g. parallelism and indexing). Perhaps even more exciting, you can embed R directly in stored procedures in SQL Server 2016. This means you could have all sorts of interesting analysis occurring in real time as new data arrives (think of things like anomaly detection, fraud detection, customer-based analytics...). This just wasn't possible in earlier versions of SQL Server.
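The difference between the two approaches can be sketched in miniature. The first function below pulls every row out to the client and analyses it there, so the whole dataset has to fit in the client's memory. The second sends the analysis to the database and gets back only the result. This is only an analogy: it is written in Python, an in-memory SQLite database stands in for SQL Server, and a plain SQL aggregate stands in for R code running inside the database. The `transactions` table, its columns and the function names are hypothetical.

```python
import sqlite3
from collections import defaultdict


def mean_by_customer_client_side(conn):
    """Pull every row out of the database, then aggregate in client memory."""
    # fetchall() loads the full table into the client process, much as reading
    # a table into an R data frame over ODBC would.
    rows = conn.execute("SELECT customer, amount FROM transactions").fetchall()
    totals = defaultdict(lambda: [0.0, 0])
    for customer, amount in rows:
        totals[customer][0] += amount
        totals[customer][1] += 1
    return {customer: total / count for customer, (total, count) in totals.items()}


def mean_by_customer_in_database(conn):
    """Let the database engine aggregate; only one row per customer comes back."""
    query = "SELECT customer, AVG(amount) FROM transactions GROUP BY customer"
    return dict(conn.execute(query))


def make_demo_db():
    """Build a tiny in-memory database with made-up transactions."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE transactions (customer TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO transactions VALUES (?, ?)",
        [("alice", 20.0), ("alice", 40.0), ("bob", 5.0), ("bob", 15.0), ("bob", 10.0)],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    conn = make_demo_db()
    print(mean_by_customer_client_side(conn))
    print(mean_by_customer_in_database(conn))
```

Both functions return the same averages. What differs is where the work happens and how much data crosses the connection, which is the point of pushing the analysis into the database.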
Hope this helps,