Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase ««1234»»»

SQL CLR Data Types and Performance Expand / Collapse
Author
Message
Posted Wednesday, July 21, 2010 11:49 AM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Thursday, January 24, 2013 12:02 PM
Points: 42, Visits: 152
wbrianwhite (7/21/2010)
Since you were using lat/long datatypes - did you look into the geometry data type?
http://msdn.microsoft.com/en-us/library/bb964711.aspx
http://social.msdn.microsoft.com/Forums/en-US/sqlspatial/thread/c557032a-038e-480c-84e8-7f0ad72e1af7

I've read about it a little, but haven't used it yet. Seems to be the way to go if you're going CLR.


Sorry, should have recommended geography, not geometry.
Post #956593
Posted Wednesday, July 21, 2010 11:59 AM


SSC Eights!

SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!

Group: General Forum Members
Last Login: Thursday, February 6, 2014 12:59 PM
Points: 801, Visits: 1,962
Hmm. Yes avoiding data type conversions is a good thing. I think that you did some other good things as well. Not repeating "Math.PI / 180.0" but using a precalculated constant is great. Plus you did that with some other non-changing elements as well. I'm then curious if your speed gains were due mostly to precalculation or data type conversion avoidance.

The CLR is based on the Dot Net framework and I have had quite a bit of experience with that. I have not gotten into the CLR yet as it is not available on the Compact Edition on the mobile side. Yet there are some things that us old procedural coders can pass along to help. I had an issue where I was looping though quite a set of items. For each one I had to call some functions that took a Long as a parameter. I saved a bit by using a Long as my loop counter and not having to convert for every function call. There was no way I was ever going to process that many items per run but using the right type saved the loop time overall.

It's good to point out this sort of thing as it's not obvious to new folk what is going on. The "less code is better code" is great but if your code calls stuff under the hood your small source can actually turn out tho be more code at run time. I suggest that you turn on both Option Explicit and Option Strict in your Visual Studio projects.

Good article.


ATB

Charles Kincaid

Post #956602
Posted Wednesday, July 21, 2010 12:32 PM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Thursday, April 14, 2011 7:04 AM
Points: 20, Visits: 177
Hi wbrianwhite,

I have looked at the geometry and geography data types / support in the later versions of SQL Server.

I don't feel that I'm familiar enough yet to comment conclusively one way or the other.

My initial tests with geography data types did not seem to perform near as well as the basic udf function outlined in the article. There may be a lot more than what "meets the eye" here though so further investigation will continue when time permits.

Post #956647
Posted Wednesday, July 21, 2010 12:56 PM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Thursday, April 14, 2011 7:04 AM
Points: 20, Visits: 177
Charles,

Great suggestions and my compliments to a 'procedural coder' pursuing 'set-based' material!

I think it's important to note that the constants not only provided a pre-calculated constant as you indicated but, taking it further, it also eliminated the recalculation efforts with each function call. It's really cool that the ability exists to define these constants with SQL CLR.

I think I can say quite confidently that the majority of performance improvement between the various flavors of the SQL CLR functions tested was due to implicit data type conversion avoidance. (Note the performance difference between DistanceNoGlobalConsts and DistanceGlobalConsts.) That said, both aspects obviously helped to make the function faster.

Post #956680
Posted Wednesday, July 21, 2010 1:12 PM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Tuesday, January 14, 2014 10:21 AM
Points: 36, Visits: 152
I have a comment regarding your T-SQL

In trying to keep all things equal, Shouldn't you declare a variable at the beginning of your function to hold the value of PI() / 180 instead of calculating it 4 times.

That was the first item I noticed.
Post #956688
Posted Wednesday, July 21, 2010 1:20 PM
SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Thursday, December 20, 2012 1:03 PM
Points: 265, Visits: 589
Thanks Matt for the comparisons. My experience has been similar with floating point calculations, between 50% to 90% improvement. For text (regex-type functions), the improvements were many fold. Of course the comparable T-SQL functions could themselves be improved but for the quickest bang-for-the-buck, CLRs really shine when the functions are well-defined. In a couple of cases, I took some previously existing functions written in C++ and made them into CLR functions so the calculations came out identical to external apps. This avoided a lot of discrepancies between T-SQL and external apps.
Post #956690
Posted Wednesday, July 21, 2010 1:22 PM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Thursday, April 14, 2011 7:04 AM
Points: 20, Visits: 177
Kevin Rathgeber (7/21/2010)
I have a comment regarding your T-SQL

In trying to keep all things equal, Shouldn't you declare a variable at the beginning of your function to hold the value of PI() / 180 instead of calculating it 4 times.

That was the first item I noticed.


Yep!

The point of the article wasn't to pit SQL CLR against T-SQL implementations (I debated whether to include the T-SQL results with it's current implementation in the article). Rather, it was to show that you don't have to use a sql data type in your SQL CLR (you can use native .net data types) and that doing so may result in performance gains - in my particular example, rather significant performance gains.

I figured Jeff would mention this shortly after I posted the requested T-SQL so I was anticipating it .

Good catch Kevin.

It should also be stated that this wouldn't make 'all things equal' per se. The SQL CLR implementation doesn't perform the calculation once, let alone 4 times with each call. By declaring a variable to eliminate the T-SQL redundancy, it still must be performed once with each call.
Post #956691
Posted Wednesday, July 21, 2010 1:32 PM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Tuesday, January 14, 2014 10:21 AM
Points: 36, Visits: 152
mzak (7/21/2010)
Kevin Rathgeber (7/21/2010)
I have a comment regarding your T-SQL

In trying to keep all things equal, Shouldn't you declare a variable at the beginning of your function to hold the value of PI() / 180 instead of calculating it 4 times.

That was the first item I noticed.


Yep!

The point of the article wasn't to pit SQL CLR against T-SQL implementations (I debated whether to include the T-SQL results with it's current implementation in the article). Rather, it was to show that you don't have to use a sql data type in your SQL CLR (you can use native .net data types) and that doing so may result in performance gains - in my particular example, rather significant performance gains.

I figured Jeff would mention this shortly after I posted the requested T-SQL so I was anticipating it .

Good catch Kevin.

It should also be stated that this wouldn't make 'all things equal' per se. The SQL CLR implementation doesn't perform the calculation once, let alone 4 times with each call. By declaring a variable to eliminate the T-SQL redundancy, it still must be performed once with each call.


Oops....posted without saying anything.

Good point that it wouldn't be equal, just a little more equal

Sorry to detract from your article, but in case you were interested, I just did some testing creating two functions. The first function calculates "PI() / 180" 4 times (your T-SQL example). The 2nd function assigns "PI() / 180" to a variable and then uses that variable 4 times (my suggestion). Surpisingly the 2nd one (my suggestion) was slower. The amount it was slower is pretty much insignificant.....by about 4/10ths of a second over 1million iterations. Long story short, ignore my comment
Post #956705
Posted Wednesday, July 21, 2010 2:13 PM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Thursday, April 14, 2011 7:04 AM
Points: 20, Visits: 177
Kevin Rathgeber (7/21/2010)
mzak (7/21/2010)
Kevin Rathgeber (7/21/2010)
I have a comment regarding your T-SQL

In trying to keep all things equal, Shouldn't you declare a variable at the beginning of your function to hold the value of PI() / 180 instead of calculating it 4 times.

That was the first item I noticed.


Yep!

The point of the article wasn't to pit SQL CLR against T-SQL implementations (I debated whether to include the T-SQL results with it's current implementation in the article). Rather, it was to show that you don't have to use a sql data type in your SQL CLR (you can use native .net data types) and that doing so may result in performance gains - in my particular example, rather significant performance gains.

I figured Jeff would mention this shortly after I posted the requested T-SQL so I was anticipating it .

Good catch Kevin.

It should also be stated that this wouldn't make 'all things equal' per se. The SQL CLR implementation doesn't perform the calculation once, let alone 4 times with each call. By declaring a variable to eliminate the T-SQL redundancy, it still must be performed once with each call.


Oops....posted without saying anything.

Good point that it wouldn't be equal, just a little more equal

Sorry to detract from your article, but in case you were interested, I just did some testing creating two functions. The first function calculates "PI() / 180" 4 times (your T-SQL example). The 2nd function assigns "PI() / 180" to a variable and then uses that variable 4 times (my suggestion). Surpisingly the 2nd one (my suggestion) was slower. The amount it was slower is pretty much insignificant.....by about 4/10ths of a second over 1million iterations. Long story short, ignore my comment


That's interesting. Although I wonder what the "cost" of declaring a variable is with respect to this scenario? If I get some time, I'll rewrite the T-SQL udf based on your suggestion and see what results.

Based on your test, I'd also be curious if changing it to eliminate some variable declarations would have much impact (remove @dDistance, @EarthRadius, etc). Also, there's no reason I couldn't assign a reasonable constant for pi() / 180.0 which essentially would get us to equality .
Post #956741
Posted Wednesday, July 21, 2010 4:18 PM
SSC Journeyman

SSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC Journeyman

Group: General Forum Members
Last Login: 2 days ago @ 10:28 AM
Points: 91, Visits: 833
I enjoyed reading this article.
Thank you for taking the time to do this.
Post #956818
« Prev Topic | Next Topic »

Add to briefcase ««1234»»»

Permissions Expand / Collapse