Machine Learning in the Database

, 2018-06-12

When SQL Server added the ability to execute R code, the decision seemed to split the customer base into two groups. One group was impressed and thought the idea of executing R code to analyze data in the database was a good idea. They were excited and impressed by the loan classification demo. If you haven't read about this or seen the demo, it's very interesting, and it's something you might take a few minutes to read or watch it.

The other group of customers felt this was a poor use of CPU cycles for a very expensive SQL Server CPU license. Running a complex analysis, training models, and other functions commonly associated with R scripts aren't a good use of scarce resources. They would rather have R code execute on a separate server, much like any large messaging workload might be better served by a service such as AWS' Simple Queue Service rather than Service Broker.

I tend to be in the first group, as is Dr. Low. He writes that there is a place where Machine Learning Services (MLS), with both R and Python, are a good use of resources. Not in all cases, and certainly not for all work. The difficult parts of training models and doing the hard work of coming up with new ways to perform an analysis is definitely better left to workstations and data scientists. Those actions might not be worth the resources they take.

Once the models are trained, however, the executable load of submitting parameters to a model and getting a prediction is small. SQL Server allows us to load pre-trained models into the database and just call them as needed. Plus, the R models run in a multi-threaded fashion, unlike the single threaded execution in clients such as R Studio.

As with any feature of SQL Server, it's important to test and evaluate the real world impact of new code on production sized workloads. Not only will you want to measure the load of your model execution, but you should also measure any changes in your existing workload with the additional R or Python code load. While I wouldn't prevent the use of MLS in SQL Server, just like SQL CLR code, I would be careful about introducing without extensive testing, including dark deployments and simulated loads.


4 (1)




4 (1)

Related content


Will the next version of Windows be a "Mini-Me" version of Vista? Who knows, and it's too early to tell, but apparently there's a mini-kernel version of Windows 7, the one after Vista, which fits into 25MB on disk. That's a touch lower than the 4GB that Vista takes up. Granted it's not a full […]


60 reads

An Hour in Time

Daylight Savings time switches a little later this year. In fact it's November 4th this year, after having been in October for all of my life. In case you don't remember which way we move the clocks, here's a saying: Spring forward, fall back.

5 (1)


199 reads

Software is Like Building a House

One of the really classic analogies in software is that it's like building a house. You have a foundation, multiple teams, lots of contractors that specialize in something, etc. And it's an analogy that's debated as to its relevance over and over. I won't go into the correctness of this analogy, but I wanted to comment on it.

2012-10-08 (first published: )

293 reads