Machine Learning Services on Database Server verses R Lang Execution on Laptop

  • We are running SQL Server 2017 (RTM) on Windows. The question came up, if we could install Machine Learning Services on our database server so a user(s) may execute Python and R scripts in the database? I am wondering if there would be a small or huge performance hit on our database server. (We try not to put other services/software/processing on our production database server.) Right now, we have one user who uses R Lang to access a group of tables in one of our user databases (not our main vendor application/database). Currently, he is loading the data from our SQL Server Database onto his laptop and running the R Lang scripts on his laptop. The user says everything is working fine, no performance issue on his laptop, but the user was wondering would there be any issues with him using SQL Server Machine Learning Services to execute Python and R scripts in the database where the data resides and eliminate the transfer of the data across the network to his laptop.

    I am just trying to get some thoughts, ideas, and opinions. Thanks in advance.

     

     

  • My opinion - I would install the R/Python stuff in SQL if you need it to store processed data in SQL OR you need it to process data that will be passed back to an application OR if you are needing a centralized location for an R script to be used by multiple people.  Kind of like SSRS or SSIS; if you don't NEED it on that instance of SQL, I wouldn't install it.

    My reason for this - it uses resources on the server.  R and Python I am pretty sure operate outside of SQL Server resources (memory and CPU).  So if someone tosses in a high memory load to R or Python, that is going to eat up resources on my server.

    Also, I would be asking them "why?".  What is their business case for installing it on the SQL side?  Sure it is possible, but they have indicated that there are no performance issues running it on their laptop, and it is for a single user.

    Now, eliminating the need to pull data from the database to their laptop is a good reason to put it on the SQL side, but it will mean that more resources are used on the database side.  It will have an impact.  It might not be a large impact, but in my opinion, making a change to a server for a single user seems like overkill.

    This is just my opinion.  My thoughts on SQL Server is that it is where my data resides.  SSIS and SSRS should reside on a different system than SQL Server (so as not to eat up resources, but comes at additional licensing costs so we currently don't have it set up that way) and same thing with R/Python.  I want the database to be as fast as it possibly can be without the overhead of extra tools; especially tools that are going to be used by only 1 user.

    I would also get them to add to the business case how much time they will save by having R on the SQL side.  Lets say they are processing 1,000 rows of data in R that are 1 KB each for a total of 1 MB of data.  If they are on a wired network connection on a LAN (ie no internet inbetween), they are looking at roughly 1 second to pull that data from the SQL side.  Is that really worth the time and effort to install and maintain R/Python in SQL?

     

    The above is just my opinion mind you.  It may be that once it is there, more people will use it.  OR it may be you spend the time to install it and the end user doesn't like the UI (SSMS/sqlcmd) needed for doing Python/R in SQL and will go back to how they did it before on their laptop...

  • Mr. Brian Gale, I just want to say thank you for such a detailed and thorough response to my post. The information you provided was very helpful and we have decided not to install Machine Learning Services on our database server.

     

Viewing 3 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply