daniel.buskirk (1/27/2016)
I think that for jobs with small data loads, like analyzing SQL Server performance information, using good old-fashioned ODBC to bring the data into a workstation instance of R will remain the better choice for many users. As you mention, the primary value of R on the server is that it eliminates the need to send large data volumes over the network.
Some readers might be interested to know that the enterprise R node installed for the R server is built on the Message Passing Interface (MPI), a tried-and-true high-performance computing protocol that has been around for a long time. For computationally intensive tasks, it is possible to achieve parallel processing by distributing the R calculations to multiple nodes. Basically, the Revolution R system allows us to add "Big Math" to "Big Data".
Dan Buskirk https://www.linkedin.com/in/sqlanalytics
Hi Dan,
Thank you for your comments, and my apologies for taking so long to respond. I agree with you 100%: when the data is a reasonable size and you have a one-off analysis to do, you are far better off pulling it out of SQL Server. I am not totally comfortable with the idea of running non-production or non-essential analysis in a production database anyway. You could lock down the privileges, but this would still be a performance drain.
But there are some applications where running analyses in-database would be beneficial, even on VERY small units of data: for example, online fraud detection, real-time customer-behaviour modeling, or real-time performance monitoring. I think these are really, really interesting areas!
Cheers,
Nick