Using the Statistical Program R for Running Statistics

  • Comments posted to this topic are about the item Using the Statistical Program R for Running Statistics

    Tomaž Kaštrun | twitter: @tomaz_tsql | Github: https://github.com/tomaztk | blog:  https://tomaztsql.wordpress.com/

  • I have only scanned the article briefly, but please check the security implications of enabling xp_cmdshell with whoever is responsible for security in your IT department. It is a very powerful command, and a lot of people won't want it enabled.

    Regards,

    Stefan.

  • Nice article. Very useful for data profiling, amongst other things.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Great article Tomaz!

    I've been doing a lot of work with R and I knew that theoretically it was possible to embed R in SQL Server, but I had no clue how. I'll be adding a link to this article on my intro to R - great resource, thanks 🙂

  • We had looked for a similar way to do this and ended up implementing an R service running in a polling loop (2 sec delay) that reads requests from a SQL table, executes the script in each request, and returns the output via another table.

    In addition, our implementation allows multiple R services on multiple systems to process requests from multiple SQL Server tasks (if we needed to do so.)

    We were a bit concerned about shelling out to a command prompt (security and performance). Still it is good to have this nicely detailed set of instructions.

    -----

    I also wanted to add that it seems very strange that R cannot be run within SQL Server due to incompatibilities with the embedded .NET libraries and the open source rserve program. PostgreSQL has a very nice way to call R from within a user function. I may sound uncharitable but I wonder if this isn't on purpose to encourage Windows users to use SSAS....
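
The polling design described in this post (requests in one table, output in another, a 2-second delay between polls) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the poster's implementation: the table and column names (`r_requests`, `r_results`, `status`) are hypothetical, `sqlite3` stands in for SQL Server so the sketch runs on its own, and `run_script` is a placeholder for whatever actually executes the R script.

```python
import sqlite3
import time
from typing import Callable

# Hypothetical schema: these table and column names are assumptions
# made for illustration, not taken from the post.
SCHEMA = """
CREATE TABLE r_requests (
    id     INTEGER PRIMARY KEY,
    script TEXT NOT NULL,
    status TEXT NOT NULL DEFAULT 'pending'
);
CREATE TABLE r_results (
    request_id INTEGER PRIMARY KEY,
    output     TEXT NOT NULL
);
"""


def process_pending(conn: sqlite3.Connection,
                    run_script: Callable[[str], str]) -> int:
    """Run every pending request once; return how many were handled."""
    rows = conn.execute(
        "SELECT id, script FROM r_requests "
        "WHERE status = 'pending' ORDER BY id"
    ).fetchall()
    for request_id, script in rows:
        try:
            output, status = run_script(script), "done"
        except Exception as exc:
            # Record the failure so the request is not retried forever.
            output, status = str(exc), "failed"
        conn.execute(
            "INSERT INTO r_results (request_id, output) VALUES (?, ?)",
            (request_id, output),
        )
        conn.execute(
            "UPDATE r_requests SET status = ? WHERE id = ?",
            (status, request_id),
        )
    conn.commit()
    return len(rows)


def poll_forever(conn: sqlite3.Connection,
                 run_script: Callable[[str], str],
                 delay_seconds: float = 2.0) -> None:
    """Poll the request table, sleeping between passes (2 s as in the post)."""
    while True:
        process_pending(conn, run_script)
        time.sleep(delay_seconds)
```

With several services polling the same table, as the post describes, each worker would also need to claim a row atomically before running it (for example by updating its status first); that step is left out here to keep the sketch short.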

  • Great article. Thanks for writing it.

  • Thank you Tomaz for this article.

    stefan.lipiec (3/10/2014)


    I have only scanned the article briefly, but please check the security implications of enabling xp_cmdshell with whoever is responsible for security in your IT department. It is a very powerful command, and a lot of people won't want it enabled.

    Regards,

    Stefan.

    For this reason, I limit the usage to desktop R and ODBC R/W connections to the Database servers.

    😎

  • How do you limit the usage?

  • wallace.dave (3/10/2014)


    How do you limit the usage?

    Hi,

    the usage would be restricted to running R from a desktop using ODBC connections and Windows authentication. The users can access the same data using R as SSMS.

    EE :Whistling:
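
For readers unfamiliar with the setup described in this reply, an ODBC connection that uses Windows authentication carries no user name or password; the driver passes along the identity of the logged-in user. A small sketch of such a connection string is shown below. The poster works from R, so this Python helper is only an illustration: the function name is hypothetical, and the driver name is an assumption that should match whatever ODBC driver is installed on the desktop.

```python
def trusted_connection_string(server: str, database: str,
                              driver: str = "ODBC Driver 17 for SQL Server") -> str:
    """Build an ODBC connection string that relies on Windows authentication.

    Trusted_Connection=yes tells the driver to use the current Windows
    login, so no credentials are stored in the string.
    The default driver name is an assumption; adjust it to the installed driver.
    """
    return (
        f"Driver={{{driver}}};"
        f"Server={server};"
        f"Database={database};"
        "Trusted_Connection=yes;"
    )
```

The resulting string can be handed to any ODBC client library; the database permissions that apply are those of the Windows account, which is why the same data is visible as in SSMS.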

  • PostgreSQL has a very nice way to call R from within a user function. I may sound uncharitable but I wonder if this isn't on purpose to encourage Windows users to use SSAS....

    I've used PL/R, and most of the other industrial-strength (i.e. commercial) databases manage to embed access to R at the engine level. With PG + PL/R it's accomplished because PG supports user-defined functions written in C/C++, and PL/R is actually written in C, so while the language attribute is "R", it's really C.

    One of the MS sites says this, however:

    "CLR functions can be used to access native (unmanaged) code, such as code written in C or C++, via the use of PInvoke from managed code"

    So, one could write a PL/R style hook for SQL Server?? Seems odd that no one has. (The same applies to DB2, by the way.) The "you can't touch Open Source without giving away *your* program" baloney hasn't stopped Oracle or SAP.

  • Thank you for the post!

    I am having a problem when executing the SP:

    This works fine:

    EXECUTE [master].[dbo].XP_FILEEXIST 'C:\Archivos de programa\'

    But this returns 'not a directory':

    EXECUTE [master].[dbo].XP_FILEEXIST 'C:\Archivos de programa\R\'

    I have googled the problem, and it seems to be related to the SQL account permissions on the directory. I can't find how to set these permissions, I don't even know which is the SQL account...

    Can anybody help?

    Thanks!

  • athibaud (3/10/2014)


    Thank you for the post!

    I am having a problem when executing the SP:

    This works fine:

    EXECUTE [master].[dbo].XP_FILEEXIST 'C:\Archivos de programa\'

    But this returns 'not a directory':

    EXECUTE [master].[dbo].XP_FILEEXIST 'C:\Archivos de programa\R\'

    I have googled the problem, and it seems to be related to the SQL account permissions on the directory. I can't find how to set these permissions, I don't even know which is the SQL account...

    Can anybody help?

    Thanks!

    Start up SQL Server Configuration Manager, go to services and check which account the SQL Server service is using.

    Then give that account appropriate permissions on the folder.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Thank you for the article. It is nice to know that this tool is available and it may be useful in some cases. However, I think Excel would be a better tool to use in most cases.

  • Only if Excel is a hammer and most of your problems are nails.

    Serious statistical processing is really outside the realm of the Excel desktop application. You may want to look at some of the exchanges on the web. Just a random post: http://www.michaelmilton.net/2010/01/26/when-to-use-excel-when-to-use-r/

    The few times I've tried to automate Excel to solve some hairy financial problems have left me gasping for a jack-hammer. Get some more tools in your belt - you'll appreciate them!

  • ashende (3/10/2014)


    Thank you for the article. It is nice to know that this tool is available and it may be useful in some cases. However, I think Excel would be a better tool to use in most cases.

    Although graphing is really powerful in Excel (if you know how and look past the defaults), R can plot complex statistical graphs in just a few lines.

    They are different things, btw: Excel is a spreadsheet (with many additional features and plugins), and R is a language.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP
