• daniel.buskirk (1/27/2016)


    Thanks, Nick, for the informative article.

    I think that for jobs with small data loads, like analyzing SQL Server performance information, using good old-fashioned ODBC to bring the data into a workstation instance of R will remain the better choice for many users. As you mention, the primary value of R on the server is that it eliminates the need to send large data volumes over the network.

    Some readers might be interested to know that the enterprise R node installed for the R server is built on the Message Passing Interface (MPI), a tried-and-true high-performance computing protocol that has been around for a long time. For computationally intensive tasks, it is possible to achieve parallel processing by distributing the R calculations to multiple nodes. Basically, the Revolution R system allows us to add "Big Math" to "Big Data".

    Dan Buskirk https://www.linkedin.com/in/sqlanalytics

    Hi Dan,

    Thank you for your comments - and my apologies for taking so long to respond. I agree with you 100%: when the data is a reasonable size and you have a one-off analysis to do, you are far better off pulling it out of SQL Server. I don't think I am totally comfortable with the idea of running non-production or non-essential analyses in a production database anyway. You could lock down the privileges, but the analysis would still be a performance drain.

    But there are some applications where running analyses in-database would be beneficial, even on VERY small units of data: for example, online fraud detection, real-time customer-behaviour modeling, or real-time performance monitoring. I think these are really interesting areas!

    Cheers,

    Nick