

SSCarpal Tunnel
Group: General Forum Members
Last Login: Friday, September 26, 2014 9:28 AM
Points: 4,362,
Visits: 6,204


mzak (7/21/2010)
Kevin Rathgeber (7/21/2010)
mzak (7/21/2010)
Kevin Rathgeber (7/21/2010) I have a comment regarding your TSQL
In trying to keep all things equal, Shouldn't you declare a variable at the beginning of your function to hold the value of PI() / 180 instead of calculating it 4 times.
That was the first item I noticed.Yep! The point of the article wasn't to pit SQL CLR against TSQL implementations (I debated whether to include the TSQL results with it's current implementation in the article). Rather, it was to show that you don't have to use a sql data type in your SQL CLR (you can use native .net data types) and that doing so may result in performance gains  in my particular example, rather significant performance gains. I figured Jeff would mention this shortly after I posted the requested TSQL so I was anticipating it . Good catch Kevin. It should also be stated that this wouldn't make 'all things equal' per se. The SQL CLR implementation doesn't perform the calculation once, let alone 4 times with each call. By declaring a variable to eliminate the TSQL redundancy, it still must be performed once with each call. Oops....posted without saying anything. Good point that it wouldn't be equal, just a little more equal Sorry to detract from your article, but in case you were interested, I just did some testing creating two functions. The first function calculates "PI() / 180" 4 times (your TSQL example). The 2nd function assigns "PI() / 180" to a variable and then uses that variable 4 times (my suggestion). Surpisingly the 2nd one (my suggestion) was slower. The amount it was slower is pretty much insignificant.....by about 4/10ths of a second over 1million iterations. Long story short, ignore my comment That's interesting. Although I wonder what the "cost" of declaring a variable is with respect to this scenario? If I get some time, I'll rewrite the TSQL udf based on your suggestion and see what results. Based on your test, I'd also be curious if changing it to eliminate some variable declarations would have much impact (remove @dDistance, @EarthRadius, etc). Also, there's no reason I couldn't assign a reasonable constant for pi() / 180.0 which essentially would get us to equality .
I too was wondering about the need (or more correctly the lack of need) for all those variables in the tsql code. Just inline everything you can and see what happens. I also wonder if things can REALLY be sped up if you can make the entire compuation a single statement (even computing the pi/180 thing every time to ensure NO variable declarations exist). If you can get to that, then try simply inlining that statement in your SELECT and avoiding the UDF altogether and see if that is even faster.
Best,
Kevin G. Boles SQL Server Consultant SQL MVP 20072012 TheSQLGuru at GMail




SSC Eights!
Group: General Forum Members
Last Login: Thursday, February 6, 2014 12:59 PM
Points: 801,
Visits: 1,962


If you do inline everything then contrast the execution plan with using a variable.
ATB
Charles Kincaid




Grasshopper
Group: General Forum Members
Last Login: Thursday, April 14, 2011 7:04 AM
Points: 20,
Visits: 177


Some interesting results!
I first tested the suggested hypothesis of declaring/setting the pi/180 as a constant in the TSQL udf.
Average time = 1444.4 milliseconds
Then I tested using no declared variables within the TSQL udf.
Average time = 1456.3 milliseconds
Finally, I tested no udf call at all (inline expression).
Average time = 664.4 milliseconds!
Of course, the inline expression would be a nightmare to maintain so there are some tradeoffs.




Forum Newbie
Group: General Forum Members
Last Login: Thursday, January 16, 2014 6:03 AM
Points: 6,
Visits: 33


It would be very interesting to see the differences using the profiler. In a demo I saw recently the client statistics showed a performance gain but when they ran profiler to compare the tsql and clr, there was a lot higher cpu utilization on the server.
I think you are really onto something with converting the data types to .net, excellent article! I learned a lot.




SSC Rookie
Group: General Forum Members
Last Login: Tuesday, January 14, 2014 10:21 AM
Points: 36,
Visits: 152


mzak (7/22/2010) Some interesting results!
I first tested the suggested hypothesis of declaring/setting the pi/180 as a constant in the TSQL udf.
Average time = 1444.4 milliseconds
Then I tested using no declared variables within the TSQL udf.
Average time = 1456.3 milliseconds
Finally, I tested no udf call at all (inline expression).
Average time = 664.4 milliseconds!
Of course, the inline expression would be a nightmare to maintain so there are some tradeoffs.
Now that is something that one fights with in his head. I am an application developer (GIS and Web based) and I must admit my raw SQL skills have not been keeping up. My head is more stuck in the .Net Object Oriented world and because of this I tend to want to keep things clean and readable and create reuse. That was one of my main reasons for suggesting the variable to hold the PI()/ 180 value.......for reuse.
The second reason is I seriously thought that caculating once and referencing the variable over again would create a performance improvement. In my tests I did not see any and in fact the variable declare was slightly worse but minimal. In the slightly worse but minimal scenario, I would opt for readability over performance. In the case you just showed above, doing everything inline, I would have to go the other way and as a developer this bugs the hell out of me lol.
EDIT: Just want to clarify  I am not saying that you can't keep things clean/readable/reusable in SQL.....the fact that it is more top down vs object oriented makes it a less cleaner/readable by nature. Because of this I find myself not caring so much about readability as I do about performance at this tier.




Ten Centuries
Group: General Forum Members
Last Login: Wednesday, September 24, 2014 1:20 PM
Points: 1,276,
Visits: 1,135


You should be able to access the SqlDecimal values directly using the .Value property of the SqlDecimal variables. The reason people recommend using SqlDecimal instead of "native" .NET Decimal is simple: NULL. Try passing a NULL into your functions; without using the nullable types your function won't be able to handle them.
My advice would be to try this:
* Use SqlDecimal for your parameters to your function * First check for NULLs and handle appropriately, first thing in the function * Then assign your SqlDecimal parameter values to local Decimal variables * Do all your calculations using the nonnullable (standard) .NET types like Decimal * Return your result by creating a new SqlDecimal instance (i.e., return new SqlDecimal(3956.0 * c))
I wouldn't use standard nonnullable .NET types for parameters and return values, but I also wouldn't use the nullable types for calculations internal to the function. Using the above I'd expect your testing time to be somewhat more than your best time, but significantly less than your worst time.
Mike C




Ten Centuries
Group: General Forum Members
Last Login: Wednesday, September 24, 2014 1:20 PM
Points: 1,276,
Visits: 1,135


Oh yeah, almost forgot to mention, your performance difference between the version with global constants and the one with no global constants is probably due to implicit conversions from Decimal and/or SqlDecimal to the .NET Double type.
That is to say when you define Decimal constants like EarthRadiusInMiles = 3956.0, .NET implicitly converts 3956.0 from Double to Decimal for you. If you define it at the top, it only has to do the implicit conversion once. If you just use 3956.0 in the code in a calculation the implicit conversion would have to take place for every reference. I tried to modify your code, as shown below, to do all internal calculations using the Double data type and convert from/to SqlDecimal only at the beginning and end of the calculation. Please forgive if I have any typos below, I usually write SQL and C#.
' Code begins here Imports System.Data.SqlTypes Imports Microsoft.SqlServer.Server Imports System.Math
Partial Public Class UserDefinedFunctions Public Const EarthRadiusInMiles As Double = 3956.0 Public Const RadianConversionConst As Double = Math.PI / 180.0 <SqlFunction()> _ Public Shared Function Distance( _ ByVal Latitude1 As SqlDecimal, _ ByVal Longitude1 As SqlDecimal, _ ByVal Latitude2 As SqlDecimal, _ ByVal Longitude2 As SqlDecimal) As SqlDecimal ' Return NULL automatically if any nulls are passed in ' Use OrElse for shortcircuit logic If (Latitude1.IsNull OrElse Latitude2.IsNull OrElse Longitude1.IsNull OrElse Longitude2.IsNull) Then Return SqlDecimal.Null End If ' Explicitly cast all inputs to Double and do all calculations as Doubles Dim Lat1 As Double = CType(Latitude1.Value, Double) Dim Long1 As Double = CType(Longitude1.Value, Double) Dim Lat2 As Double = CType(Latitude2.Value, Double) Dim Long2 As Double = CType(Longitude2.Value, Double) Dim Latitude1Radian As Double = Lat1 * RadianConversionConst Dim Longitude1Radian As Double = Long1 * RadianConversionConst Dim Latitude2Radian As Double = Lat2 * RadianConversionConst Dim Longitude2Radian As Double = Long2 * RadianConversionConst Dim LongSpanRadian As Double = Longitude2Radian  Longitude1Radian Dim LatSpanRadian As Double = Latitude2Radian  Latitude1Radian 'intermediate result a Dim a As Double = Math.Pow(Math.Sin(LatSpanRadian / 2.0), 2) + Math.Cos(Latitude1Radian) * Math.Cos(Latitude2Radian) * Math.Pow(Math.Sin(LongSpanRadian / 2.0), 2) 'intermediate result c (great circle distance in radians) Dim c As Double = 2.0 * Math.Atan2(Math.Sqrt(a), Math.Sqrt(1.0  a)) Return New SqlDecimal(EarthRadiusInMiles * c) End Function End Class ' Code ends here
Mike C




Grasshopper
Group: General Forum Members
Last Login: Thursday, April 14, 2011 7:04 AM
Points: 20,
Visits: 177


Mike C (7/22/2010)
Oh yeah, almost forgot to mention, your performance difference between the version with global constants and the one with no global constants is probably due to implicit conversions from Decimal and/or SqlDecimal to the .NET Double type. That is to say when you define Decimal constants like EarthRadiusInMiles = 3956.0, .NET implicitly converts 3956.0 from Double to Decimal for you. If you define it at the top, it only has to do the implicit conversion once. If you just use 3956.0 in the code in a calculation the implicit conversion would have to take place for every reference. I tried to modify your code, as shown below, to do all internal calculations using the Double data type and convert from/to SqlDecimal only at the beginning and end of the calculation. Please forgive if I have any typos below, I usually write SQL and C#. ' Code begins here Imports System.Data.SqlTypes Imports Microsoft.SqlServer.Server Imports System.Math
Partial Public Class UserDefinedFunctions Public Const EarthRadiusInMiles As Double = 3956.0 Public Const RadianConversionConst As Double = Math.PI / 180.0 <SqlFunction()> _ Public Shared Function Distance( _ ByVal Latitude1 As SqlDecimal, _ ByVal Longitude1 As SqlDecimal, _ ByVal Latitude2 As SqlDecimal, _ ByVal Longitude2 As SqlDecimal) As SqlDecimal ' Return NULL automatically if any nulls are passed in ' Use OrElse for shortcircuit logic If (Latitude1.IsNull OrElse Latitude2.IsNull OrElse Longitude1.IsNull OrElse Longitude2.IsNull) Then Return SqlDecimal.Null End If ' Explicitly cast all inputs to Double and do all calculations as Doubles Dim Lat1 As Double = CType(Latitude1.Value, Double) Dim Long1 As Double = CType(Longitude1.Value, Double) Dim Lat2 As Double = CType(Latitude2.Value, Double) Dim Long2 As Double = CType(Longitude2.Value, Double) Dim Latitude1Radian As Double = Lat1 * RadianConversionConst Dim Longitude1Radian As Double = Long1 * RadianConversionConst Dim Latitude2Radian As Double = Lat2 * RadianConversionConst Dim Longitude2Radian As Double = Long2 * RadianConversionConst Dim LongSpanRadian As Double = Longitude2Radian  Longitude1Radian Dim LatSpanRadian As Double = Latitude2Radian  Latitude1Radian 'intermediate result a Dim a As Double = Math.Pow(Math.Sin(LatSpanRadian / 2.0), 2) + Math.Cos(Latitude1Radian) * Math.Cos(Latitude2Radian) * Math.Pow(Math.Sin(LongSpanRadian / 2.0), 2) 'intermediate result c (great circle distance in radians) Dim c As Double = 2.0 * Math.Atan2(Math.Sqrt(a), Math.Sqrt(1.0  a)) Return New SqlDecimal(EarthRadiusInMiles * c) End Function End Class ' Code ends here
Mike C
I was going to implement your suggestions and call it Distance_MikeC_Style!
Good catch on the null values. I shouldn't have any passed into this based on how it's currently implemented in our environment but the function should be able to handle it eloquently. It bothers me somewhat that I didn't think about this and provide for it  let alone have a disclaimer in the article in case wanderers miss this discussion. I'll see if that can be updated (but likely next week as I'm about gone for the day!)
I need to read up more on .net data types in reference to differences between double's and decimal's and admittedly should have done that before posting this. Thanks for sharing your knowledge with respect to this.
I appreciate the code representation of your suggestions. I will test this "Distance_MikeC_Style" and post the results!




Grasshopper
Group: General Forum Members
Last Login: Thursday, April 14, 2011 7:04 AM
Points: 20,
Visits: 177


As per Mike C's insight, the following is an update to the SQL CLR distance function using global constants where possible and handling the SqlDecimal > Double > SqlDecimal processing correctly.
Average execution time using the same 30 iteration test was still impressive at 784.9 milliseconds.
Thanks Mike. Much appreciated!
Imports System.Data.SqlTypes Imports Microsoft.SqlServer.Server
Partial Public Class UserDefinedFunctions
Public Const EarthRadiusInMiles As Double = 3956.0 Public Const RadianConversionConst As Double = Math.PI / 180.0
<SqlFunction()> _ Public Shared Function Distance_MikeC_Style( _ ByVal Latitude1 As SqlDecimal, _ ByVal Longitude1 As SqlDecimal, _ ByVal Latitude2 As SqlDecimal, _ ByVal Longitude2 As SqlDecimal) As SqlDecimal
' Return NULL automatically if any nulls are passed in ' Use OrElse for shortcircuit logic If (Latitude1.IsNull OrElse Latitude2.IsNull OrElse Longitude1.IsNull OrElse Longitude2.IsNull) Then Return SqlDecimal.Null End If ' Explicitly cast all inputs to Double and do all calculations as Doubles Dim Lat1 As Double = CType(Latitude1.Value, Double) Dim Long1 As Double = CType(Longitude1.Value, Double) Dim Lat2 As Double = CType(Latitude2.Value, Double) Dim Long2 As Double = CType(Longitude2.Value, Double)
Dim Latitude1Radian As Double = Lat1 * RadianConversionConst Dim Longitude1Radian As Double = Long1 * RadianConversionConst Dim Latitude2Radian As Double = Lat2 * RadianConversionConst Dim Longitude2Radian As Double = Long2 * RadianConversionConst
Dim LongSpanRadian As Double = Longitude2Radian  Longitude1Radian Dim LatSpanRadian As Double = Latitude2Radian  Latitude1Radian
'intermediate result a Dim a As Double = Math.Pow(Math.Sin(LatSpanRadian / 2.0), 2) + Math.Cos(Latitude1Radian) * Math.Cos(Latitude2Radian) * Math.Pow(Math.Sin(LongSpanRadian / 2.0), 2)
'intermediate result c (great circle distance in radians) Dim c As Double = 2.0 * Math.Atan2(Math.Sqrt(a), Math.Sqrt(1.0  a))
Return New SqlDecimal(EarthRadiusInMiles * c) End Function




Grasshopper
Group: General Forum Members
Last Login: Thursday, April 14, 2011 7:04 AM
Points: 20,
Visits: 177


Incidently, I tested the "Mike C style" function with minimal variable dimensioning (basically a hybrid of what we were trying on the TSQL side based on the Kevins' postings).
It results in a very long return new sqldecimal( ... ) line!
Average execution time over 30 iterations was slightly better than non"minimal variable dimensioning" at 777.6 milliseconds.
Again, I'd rather have more maintainable code than such a subtle performance improvement but in some instances, that may not be a luxury!



