SQLServerCentral Article

SQL CLR Data Types and Performance

,

This article doesn't introduce anything new with respect to development instead but highlights the effects of optimizing code for "mass executions" as is likely typical with a SQL CLR deployment.

I'm just now getting to a point where I can start playing with the SQL CLR capabilities of SQL Server 2005 / 2008 / 2008 R2. I started out with what I determined would be the best possible scenario we (my company) stand to gain from the availability of SQL CLR - a distance calculation user-defined function. The T-SQL implementation is not overly complicated or messy, but given it is purely a computational scalar function and based on what I've read online regarding when to use the SQL CLR, I saw this as a good candidate.

My first implementation of the Distance user-defined SQL CLR function looked similar to this:

<SqlFunction()> _
Public Shared Function Distance( _
 ByVal Latitude1 As SqlDecimal, _
 ByVal Longitude1 As SqlDecimal, _
 ByVal Latitude2 As SqlDecimal, _
 ByVal Longitude2 As SqlDecimal) As SqlDecimal
Dim Latitude1Radian As SqlDecimal = Latitude1 * Math.PI / 180.0
 Dim Longitude1Radian As SqlDecimal = Longitude1 * Math.PI / 180.0
 Dim Latitude2Radian As SqlDecimal = Latitude2 * Math.PI / 180.0
 Dim Longitude2Radian As SqlDecimal = Longitude2 * Math.PI / 180.0
Dim LongSpanRadian As SqlDecimal = Longitude2Radian - Longitude1Radian
 Dim LatSpanRadian As SqlDecimal = Latitude2Radian - Latitude1Radian
'intermediate result a
 Dim a As SqlDecimal = Math.Pow(Math.Sin(LatSpanRadian / 2.0), 2) + Math.Cos(Latitude1Radian) * Math.Cos(Latitude2Radian) * Math.Pow(Math.Sin(LongSpanRadian / 2.0), 2)
'intermediate result c (great circle distance in radians)
 Dim c As SqlDecimal = 2.0 * Math.Atan2(Math.Sqrt(a), Math.Sqrt(1.0 - a))
Return 3956.0 * c
End Function

How did it perform? Not too bad. I calculated (in a very unofficial fashion) about an 11% reduction in time to execute a query compared to the pure T-SQL implementation. But based on reports I've seen regarding the benefits of SQL CLR, I must say I was disappointed in such a marginal improvement.

As such, I set about trying some different methods to attempt to optimize the function.

Notice the prevalence of SqlDecimal data types. I used them (thinking I had to) based on examples I've found on the Internet. Then I saw an example that used Decimal (a native .net data type) instead and I thought I'd try that to see if there were any differences. I also thought I could eliminate redundant operations by creating some public constants.

The resulting block of code (shown below) was my next iteration of the Distance user-defined function using SQL CLR.

Imports System.Data.SqlTypes
Imports Microsoft.SqlServer.Server
Partial Public Class UserDefinedFunctions
Public Const EarthRadiusInMiles As Decimal = 3956.0
 Public Const RadianConversionConst As Decimal = Math.PI / 180.0
<SqlFunction()> _
 Public Shared Function Distance( _
 ByVal Latitude1 As Decimal, _
 ByVal Longitude1 As Decimal, _
 ByVal Latitude2 As Decimal, _
 ByVal Longitude2 As Decimal) As Decimal
Dim Latitude1Radian As Decimal = Latitude1 * RadianConversionConst
 Dim Longitude1Radian As Decimal = Longitude1 * RadianConversionConst
 Dim Latitude2Radian As Decimal = Latitude2 * RadianConversionConst
 Dim Longitude2Radian As Decimal = Longitude2 * RadianConversionConst
Dim LongSpanRadian As Decimal = Longitude2Radian - Longitude1Radian
 Dim LatSpanRadian As Decimal = Latitude2Radian - Latitude1Radian
'intermediate result a
 Dim a As Decimal = Math.Pow(Math.Sin(LatSpanRadian / 2.0), 2) + Math.Cos(Latitude1Radian) * Math.Cos(Latitude2Radian) * Math.Pow(Math.Sin(LongSpanRadian / 2.0), 2)
'intermediate result c (great circle distance in radians)
 Dim c As Decimal = 2.0 * Math.Atan2(Math.Sqrt(a), Math.Sqrt(1.0 - a))
Return EarthRadiusInMiles * c
 End Function
End Class

The result was an (unofficial) 60% reduction in execution time! And it makes sense. The SqlDecimal would have to be converted to the associated native data type when passed into the various methods (Math.Cos, Math.Pow, Math.Sin, etc). That's a lot of conversions and added overhead for no substantial gain.

I then decided to write a simple test template and get some more accurate results of these differences and how performance is impacted. I created the four following methods within my SQL CLR library file:

  • DistanceSqlTypesNoGlobalConsts
  • DistanceSqlTypesGlobalConsts
  • DistanceNoGlobalConsts
  • DistanceGlobalConsts

I then created 4 separate 'stub' user-defined functions that called the associated assembly method and executed each one 30 times using a table of 100,000 latitude/longitude values and the following loop (showing the execution of one of the test functions - DistanceGlobalConsts).

DECLARE @counter int, @StartTime datetime, @EndTime datetime, @ElapsedMilliseconds int
DECLARE @ResultsTable table (iteration int not null, elapsedtime int not null)
SET @counter = 30
WHILE @counter > 0
BEGIN
 -- Clear cached data
 CHECKPOINT;
 DBCC DROPCLEANBUFFERS;
 DBCC FREEPROCCACHE;
 SELECT @StartTime = GETDATE();
 SELECT dbo.DistanceGlobalConsts(40.82000, -73.05000, Latitude, Longitude)
 FROM TestTable
 SELECT @EndTime = GETDATE();
 SELECT @ElapsedMilliseconds = DATEDIFF(ms, @StartTime, @EndTime);
 INSERT INTO @ResultsTable (iteration, elapsedtime)
 VALUES (@counter, @ElapsedMilliseconds)
 SET @counter = @counter - 1;
END
SELECT *
FROM @ResultsTable

The full results are attached for your review but the condensed results are presented in the table below.

Summary of Results
MethodAverage Time (milliseconds)
DistanceSqlTypesNoGlobalConsts
1821.766667
DistanceSqlTypesGlobalConsts
1283.966667
DistanceNoGlobalConsts
756.866667
DistanceGlobalConsts *
734.433333
T-SQL
1403.933333

* Note: I ran the DistanceGlobalConsts test case first and the first iteration of the 30 iteration test was about 200 milliseconds longer than the other iterations. I did not notice the same discrepancy on my other test cases so I re-processed the DistanceGlobalConsts and saw the first iteration was consistent with the other 29 iterations. I suspect the difference may have been a result of loading the assembly since it was the initial call to the assembly since it was created. Since all methods were in the same assembly, I think it is more consistent to use the re-processed results for DistanceGlobalConsts. Both tests are included in the full results attachment.

I like to think the hardware is immaterial for the purposes of this test (all tests were run on the same hardware in the same configuration/state); however, if you are interested in me posting specifics about the hardware used in this test, please feel free to request the details you are interested in. Also, I haven't considered any other performance metrics (disk io or memory consumption, in particular) so those should be considered as well.

There may be more that can be done to optimize this particular user-defined function but the general lack of information outlining the significance of using native .net data types versus the sqltypes within the SQL CLR assembly prompted me to share my discovery as I embark on playing with the SQL CLR.

Resources

Rate

4 (30)

You rated this post out of 5. Change rating

Share

Share

Rate

4 (30)

You rated this post out of 5. Change rating