Using the Statistical Program R for Running Statistics

  • tomaz.kastrun

    SSCrazy

    Points: 2085

    Comments posted to this topic are about the item Using the Statistical Program R for Running Statistics

    Tomaž Kaštrun | twitter: @tomaz_tsql | blog:  https://tomaztsql.wordpress.com/

  • stefan.lipiec

    Old Hand

    Points: 326

    I have only scanned the article briefly, but please check the security implications of enabling xp_cmdshell with whoever is responsible for security in your IT department. It is a very powerful command, and a lot of people won't want it enabled.

    Regards,

    Stefan.

  • Koen Verbeeck

    SSC Guru

    Points: 258955

    Nice article. Very useful for data profiling, amongst other things.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Steph Locke

    SSCrazy

    Points: 2857

    Great article Tomaz!

    I've been doing a lot of work with R, and I knew that it was theoretically possible to embed R in SQL Server, but I had no clue how. I'll be adding a link to this article on my intro to R - great resource, thanks 🙂

  • rp-sqlsc

    SSC Veteran

    Points: 203

    We looked for a similar way to do this and ended up implementing an R service that runs in a polling loop (2-second delay), reads requests from a SQL table, executes the script in each request, and returns the output via another table.

    In addition, our implementation allows multiple R services on multiple systems to process requests from multiple SQL Server tasks (if we needed to do so.)

    We were a bit concerned about shelling out to a command prompt (security and performance). Still it is good to have this nicely detailed set of instructions.

    -----

    I also wanted to add that it seems very strange that R cannot be run within SQL Server due to incompatibilities with the embedded .NET libraries and the open source rserve program. PostgreSQL has a very nice way to call R from within a user function. I may sound uncharitable but I wonder if this isn't on purpose to encourage Windows users to use SSAS....

  • Stan Kulp-439977

    SSCrazy Eights

    Points: 9948

    Great article. Thanks for writing it.

  • Eirikur Eiriksson

    SSC Guru

    Points: 182367

    Thank you Tomaz for this article.

    stefan.lipiec (3/10/2014)


    I have only scanned the article briefly, but please check the security implications of enabling xp_cmdshell with whoever is responsible for security in your IT department. It is a very powerful command, and a lot of people won't want it enabled.

    Regards,

    Stefan.

    For this reason, I limit the usage to desktop R and ODBC R/W connections to the Database servers.

    😎

  • wallace.dave

    Valued Member

    Points: 56

    How do you limit the usage?

  • Eirikur Eiriksson

    SSC Guru

    Points: 182367

    wallace.dave (3/10/2014)


    How do you limit the usage?

    Hi,

    The usage would be restricted to running R from a desktop using ODBC connections and Windows authentication. Users can access the same data from R as they can from SSMS.
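    As a rough illustration of what an ODBC connection with Windows authentication looks like, here is a small Python sketch that only builds the connection string. The server name, database name, and driver name are hypothetical; `Trusted_Connection=yes` is the keyword that asks the SQL Server ODBC driver to use the caller's Windows login instead of a SQL login and password.

```python
def build_connection_string(server, database,
                            driver="ODBC Driver 17 for SQL Server"):
    # The driver name is an assumption; use whichever SQL Server ODBC driver
    # is actually installed on the desktop.
    parts = [
        "Driver={" + driver + "}",
        "Server=" + server,
        "Database=" + database,
        "Trusted_Connection=yes",  # Windows authentication, no stored password
    ]
    return ";".join(parts) + ";"


# Hypothetical server and database names, for illustration only.
example = build_connection_string("dbserver01", "Sales")
```

    Because no credentials are stored in the string, the database permissions are simply those of the Windows account running the desktop session, which is what keeps R users on the same footing as SSMS users.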

    EE

  • RobertYoung

    Ten Centuries

    Points: 1130

    PostgreSQL has a very nice way to call R from within a user function. I may sound uncharitable but I wonder if this isn't on purpose to encourage Windows users to use SSAS....

    I've used PL/R, and most of the other industrial-strength (i.e. commercial) databases manage to embed access to R at the engine level. With PG + PL/R this is possible because PG supports user-defined functions written in C/C++, and PL/R itself is written in C, so while the language attribute is "R", it's really C.

    One of the MS sites says this, however:

    "CLR functions can be used to access native (unmanaged) code, such as code written in C or C++, via the use of PInvoke from managed code"

    So, one could write a PL/R style hook for SQL Server?? Seems odd that no one has. (The same applies to DB2, by the way.) The "you can't touch Open Source without giving away *your* program" baloney hasn't stopped Oracle or SAP.

  • athibaud

    Grasshopper

    Points: 11

    Thank you for the post!

    I am having a problem when executing the SP:

    This works fine:

    EXECUTE [master].[dbo].XP_FILEEXIST 'C:\Archivos de programa\'

    But this returns 'not a directory':

    EXECUTE [master].[dbo].XP_FILEEXIST 'C:\Archivos de programa\R\'

    I have googled the problem, and it seems to be related to the SQL account permissions on the directory. I can't find how to set these permissions, I don't even know which is the SQL account...

    Can anybody help?

    Thanks!

  • Koen Verbeeck

    SSC Guru

    Points: 258955

    athibaud (3/10/2014)


    Thank you for the post!

    I am having a problem when executing the SP:

    This works fine:

    EXECUTE [master].[dbo].XP_FILEEXIST 'C:\Archivos de programa\'

    But this returns 'not a directory':

    EXECUTE [master].[dbo].XP_FILEEXIST 'C:\Archivos de programa\R\'

    I have googled the problem, and it seems to be related to the SQL account permissions on the directory. I can't find how to set these permissions, I don't even know which is the SQL account...

    Can anybody help?

    Thanks!

    Start up SQL Server Configuration Manager, go to services and check which account the SQL Server service is using.

    Then give that account appropriate permissions on the folder.
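    For the second step, one option on Windows is the `icacls` command-line tool. The sketch below, in Python, only assembles the command; the folder and the account name are assumptions for illustration (use the account that Configuration Manager actually shows), and it would need to be run from an elevated prompt.

```python
def build_grant_command(folder, account, rights="RX"):
    # (OI)(CI) makes the grant inherit to files and subfolders;
    # RX is read-and-execute access.
    return ["icacls", folder, "/grant", f"{account}:(OI)(CI){rights}"]


# Hypothetical service account; the folder is the one from the question above.
cmd = build_grant_command(r"C:\Archivos de programa\R", r"NT Service\MSSQLSERVER")
```

    The resulting list could be passed to `subprocess.run(cmd, check=True)`, or the same arguments typed directly at a command prompt.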


  • ashende

    Valued Member

    Points: 53

    Thank you for the article. It is nice to know that this tool is available and it may be useful in some cases. However, I think Excel would be a better tool to use in most cases.

  • rp-sqlsc

    SSC Veteran

    Points: 203

    Only if Excel is a hammer and most of your problems are nails.

    Serious statistical processing is really outside the realm of the Excel desktop application. You may want to look at some of the discussions on the web. Just a random post: http://www.michaelmilton.net/2010/01/26/when-to-use-excel-when-to-use-r/

    The few times I've tried to automate Excel to solve some hairy financial problems have left me gasping for a jack-hammer. Get some more tools in your belt - you'll appreciate them!

  • Koen Verbeeck

    SSC Guru

    Points: 258955

    ashende (3/10/2014)


    Thank you for the article. It is nice to know that this tool is available and it may be useful in some cases. However, I think Excel would be a better tool to use in most cases.

    Although graphing is really powerful in Excel (if you know how and look past the defaults), R can plot complex statistical graphs in just a few lines.

    They are different things, by the way: Excel is a spreadsheet (with many, many additional features and plugins), and R is a language.


