|
|
|
SSChampion
        
Group: General Forum Members
Last Login: Today @ 9:59 PM
Points: 20,119,
Visits: 13,661
|
|
Jonathan Kehayias (2/6/2009)
I didn't use his CLR method directly because it has an inherent flaw in its usage of Regex in it that I wouldn't make in my own code. Instead I used a functionally identical function:
Outstanding! Thanks for running the tests! Like I said, the one place that Matt Miller, Sergiy, and I proved that CLR's were definitely better for performance was in the area of RegEx and I was also absolutely sure that you could make a major improvement there, as well. Well done! And, I'm not being a bit ironic there... I mean it.
Because of the crud you have to go through with OLE automation and the slowness of the associated sp_OA* routines, I'd be willing to bet that a CLR would also beat the code I posted for performance. We didn't test that in any of our deep dives, but I'm sure it's true and a good CLR for that would be great.
But, like you said, performance isn't the only thing to consider when making the decision to use a CLR. Look at the installation code you had to write. Can a "newbie" actually do that as you suggested? I submit, probably not. And then, there are those hardcore DBA's that say "Not on my watch" when it comes to CLR's. Better know how to do several different ways... something a "newbie" wouldn't actually know, anyway.
As you've just shown, although there's a 50% difference in performance between that given CLR and the fastest code I wrote in T-SQL to do the same thing but, consider for a moment that it's only 60 milliseconds difference... across a million rows and extrapolating your run times on 10k rows vs a million rows, thats only 6 seconds difference... and I have two lines in a WHERE clause... with the CLR, there's source code to check in, a DLL to test and install, and an installation procedure to make the code useful as a CLR function. Lord help us if we migrate servers.
Just to be clear, I'm not saying that all CLR's should be avoided. You said it quite correctly when you said that some folks abuse them and you picked a very good example for what a CLR should probably be used for. In fact, I have no problem with your good article at all. One of my major points was that someone posted some a link to some pretty bad T-SQL and the race between that and a CLR and a test that probably wasn't even done correctly. CLR's get glamourized because of such mistakes much like set based code gets a bad name because someone writes some really bad set based code to compare to. I want to make sure that, because of that miserable article that someone referred us all to, that people don't go CLR crazy like they did when cursors were introduced in early versions (6.5 I believe) of SQL Server.
--Jeff Moden "RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".
First step towards the paradigm shift of writing Set Based code: Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."
For better, quicker answers, click on the following... http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following... http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
|
|
|
|
|
SSChampion
        
Group: General Forum Members
Last Login: Today @ 9:59 PM
Points: 20,119,
Visits: 13,661
|
|
Jonathan Kehayias (2/6/2009)
Jeff Moden (2/6/2009)
Heh... listen to you... Anyone capable of writing either the CLR or the T-SQL isn't going to miss the sp_OADestroy. Anyone who doesn't understand enough about T-SQL will also like not know enough about how to prevent memory leaks using a CLR.
Not true. I've seen this happen in consulting twice for OLE automation and a few times for XML documents that get created but not removed. Its not that difficult of a mistake to make. Leaking in CLR is only possible with UNSAFE assemblies, as a part of the design of the hosting environment. The ability to leak memory requires Host Protection Attributes that are prevented from loading in anything but UNSAFE. So does putting my directory code in a stored proc. And it doesn't require the monster code you included in your article just to install the CLR. You could do everyone a favor and just reference them to the text file rather than posting the varbinary output string of a compiled assembly in here and making this page of the forums unusable because it doesn't know how to handle a string that long. You aren't comparing similar code, by looking at the binary compiled assembly object. The code is the C# and should the following connect item get added to SQL Server: http://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=265266the ability to write code similar to the following will exist in SQL: CREATE PROCEDURE my_proc @par1 int, @par2 int WITH EXTERNAL LANGUAGE C#, PERMISSION_SET = EXTERNAL_ACCESS AS using System; using System.Data; using System.Data.SqlClient; ...
That's what I mean... what's the difference between someone forgetting to do an sp_OADestroy and someone forgetting to mark a CLR as something other than UNSAFE especially if they're a newbie? Even if it gives you a warning, is a newbie gonna know what to do about it?
But, point well taken, hopefully on both parts... you do have a caveat for both methods.
Ya beat me to it on the other thing... I didn't realize the binary was quite so wide in your code and started fixing it as you were typing. It's fixed now. Thanks for the heads up, though.
--Jeff Moden "RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".
First step towards the paradigm shift of writing Set Based code: Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."
For better, quicker answers, click on the following... http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following... http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
|
|
|
|
|
Ten Centuries
      
Group: General Forum Members
Last Login: Today @ 10:13 AM
Points: 1,310,
Visits: 1,228
|
|
Its taken me close to a year to create a series of articles that have sufficient utility to be worthwhile, and are logically good use of SQLCLR so as to not be chewed alive by the community. Replacing less secure methodologies is my favorite place to make use of SQLCLR, though generally speaking, I wouldn't be reading directories from my TSQL code either, so I don't really have these examples in use in my own environment.
The EmailRegex I do have in use in a data warehouse that I built that loads data from all different kinds of sources, so it is very useful in the data cleansing process post load. I didn't do that in CLR for performance, more than I already had the code from answering a forums post, and it was easily reused. The difference in performance between the TSQL and the CLR isn't enough to make it matter to me. Like I said, if you are talking ms what did you really save and do you really notice it?
As for the installation script I provided, could anybody do that. Sure, if they were willing to research the "best practice" implementation for a EXTERNAL_ACCESS Assembly. Most aren't willing to go that route and instead just flip TRUSTWORTHY ON and move on with deploying under dbo. In fact, I think I have one of the few articles on the web that actually uses Keys over Trustworthy for non-SAFE assemblies. Since my purpose was to demonstrate how to be more "secure" with SQLCLR, setting TRUSTWORTHY wouldn't have really been good to do, especially in master, so I did it the right way, rather than the easy way. The installer script here does provide a simple framework that can be used for any assembly. Just a little copy paste rename kind of stuff to do.
Jonathan Kehayias http://www.sqlclr.net http://sqlblog.com/blogs/jonathan_kehayias http://www.twitter.com/SQLSarg
|
|
|
|
|
Ten Centuries
      
Group: General Forum Members
Last Login: Today @ 10:13 AM
Points: 1,310,
Visits: 1,228
|
|
Jeff Moden (2/6/2009)
Ya beat me to it on the other thing... I didn't realize the binary was quite so wide in your code and started fixing it as you were typing. It's fixed now. Thanks for the heads up, though.
I wish there was a better way to provide that binary string in forums. At least here you can attach files to forums posts as text. Other places it is completely impossible to send someone a sample CLR compiled assembly for even a simple Hello World example. It is equally as long as the assembly I posted for reading the directory information.
Jonathan Kehayias http://www.sqlclr.net http://sqlblog.com/blogs/jonathan_kehayias http://www.twitter.com/SQLSarg
|
|
|
|
|
SSChampion
        
Group: General Forum Members
Last Login: Today @ 9:59 PM
Points: 20,119,
Visits: 13,661
|
|
Jonathan Kehayias (2/6/2009) so I did it the right way, rather than the easy way
BWAA-HAA!!! Man, I'm right there with you on that one! Now, all we have to do is convince the rest of the world, especially the managers that want it real bad, to give people just a little more time so they don't get it that way... real bad.
Good talking with you again, Jonathan. Kudos on the article.
--Jeff Moden "RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".
First step towards the paradigm shift of writing Set Based code: Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."
For better, quicker answers, click on the following... http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following... http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
|
|
|
|
|
Ten Centuries
      
Group: General Forum Members
Last Login: Today @ 10:13 AM
Points: 1,310,
Visits: 1,228
|
|
Jeff Moden (2/6/2009)
Jonathan Kehayias (2/6/2009) so I did it the right way, rather than the easy wayBWAA-HAA!!! Man, I'm right there with you on that one! Now, all we have to do is convince the rest of the world, especially the managers that want it real bad, to give people just a little more time so they don't get it that way... real bad.Good talking with you again, Jonathan. Kudos on the article.
I've got some more coming that it will be fun to see your comments on. All along this same line of processing, but its always interesting to see what you come up with for TSQL equivalents to the SQLCLR. As you say, the places I can write CLR to beat TSQL has been limited, and some of your tricks like appending a '/' at the end of a URL in TSQL to make domain name extraction lightening fast are very interesting and something to learn from.
Jonathan Kehayias http://www.sqlclr.net http://sqlblog.com/blogs/jonathan_kehayias http://www.twitter.com/SQLSarg
|
|
|
|
|
Valued Member
      
Group: General Forum Members
Last Login: Saturday, March 13, 2010 1:17 AM
Points: 51,
Visits: 234
|
|
This solution is based on the premise that security is important enough to take the time to implement. If time is more important than preventing unrestricted access to the OS as the SQL Service account with xp_cmdshell, then this solution probably isn't for you. There is nothing on the machine other than SQL Server databases protected by failover or log shipping. So what if the machine crashes; that's just a blip. If a hacker has ever been able to copy an .mdf file from a machine to outside the firewall because xp_cmdshell was turned on, I would like to read about it. Security isn't about the OS; it is about the company's data.
Data security principle -- developers make the access to the data visible to the administrators who are responsible for the security of production data, after the developers go on to other projects, by writing simple, compact T-SQL, tested for performance. What the application does with the data should not be hard to see inside a pile of clever code-- that is a data security issue. It's not about how easy or hard it is for the developer to write the code; it is about how the code is maintained and explained two years later.
I'm talking about database programming for business applications. I suppose UI considerations are primary for other applications.
My SQL Server policy is xp_cmdshell on and CLR off -- unless I hear of theft of an .mdf that was accomplished using xp_cmdshell.
And thanks to someone for fixing the window that displays code. It's normal width now.
_________________ "Look, those sheep have been shorn." data analyst replies, "On the sides that we can see.."
|
|
|
|
|
Old Hand
      
Group: General Forum Members
Last Login: Today @ 2:56 AM
Points: 388,
Visits: 1,518
|
|
That was it and I agree... someone like you wouldn't pass a bad path.
Remember the advice in 'The C Programming Language by Dennis M. Richie and Brian W. Kernighan' to always put in error handling even when an error couldn't possibly happen? I've lost count of the number of times I've thanked those guys for that advice. I've seen the error 'this cannot possibly happen' more times than I'd like to admit.
I'm not going to back down over my gentle advice that the CLR must react properly to all possible inputs. After all, it was because people neglected this that SQL Injection became possible. If a routine is for your own private use, then bless you, but if it is published on the internet then prepare for a comment by Phil.
Oh and another thing; it was particularly the use of the Asymmetric key that I was so pleased about in my original comment. It is sooo important.
Thanks for the update. I've now trashed my old CLR Directory routine, and adopted yours instead.
I don't yet see CLR as being a simple substitute for OLE automation. OLE is much quicker to lash together and maintain, if you have good templates and test harnesses. I see CLR more as a way of providing extensions to the SQL Server system to handle complex data. Now that is really exciting.......
Best wishes,
Phil Factor Simple Talk
|
|
|
|
|
Forum Newbie
      
Group: General Forum Members
Last Login: Tuesday, March 03, 2009 5:31 PM
Points: 6,
Visits: 116
|
|
|
|
|
|
SSChasing Mays
      
Group: General Forum Members
Last Login: Today @ 1:32 PM
Points: 666,
Visits: 1,697
|
|
Jeff Moden (2/6/2009) .... One of my major points was that someone posted some a link to some pretty bad T-SQL and the race between that and a CLR and a test that probably wasn't even done correctly.....
Boy, am I glad to be "someone" now..... I was perfectly aware that the T-SQL code is not the best. The idea of providing that link was to show the reasoning behind Andy's article on possible applications for SQLCLR. He explicitly wrote that the code is not the best.. I hope people read! Besides that, it shows with plenty-o-external-links the process on creating a CLR, informative for anyone..
Yes, yes, mea culpa on trying to show everyone bad coding and trying to point that "people don't go CLR crazy". If someone just takes any code thrown at them without even thinking on the consequences of applying it in their environment, they have no business being a DBA/Developer.
At least it sparked a series of very informative discussion between you and Jonathan, with your always excellent code samples :)
Cheers
_______________________________________________________________________ For better assistance in answering your questions, click here
|
|
|
|