SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


«««34567»»»

Trading in xp_cmdshell for SQLCLR (Part 1) - List Directory Contents Expand / Collapse
Author
Message
Posted Friday, February 06, 2009 10:36 PM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Today @ 9:59 PM
Points: 20,119, Visits: 13,661
Jonathan Kehayias (2/6/2009)

I didn't use his CLR method directly because it has an inherent flaw in its usage of Regex in it
that I wouldn't make in my own code. Instead I used a functionally identical function:


Outstanding! Thanks for running the tests! Like I said, the one place that Matt Miller, Sergiy,
and I proved that CLR's were definitely better for performance was in the area of RegEx and I
was also absolutely sure that you could make a major improvement there, as well. Well done!
And, I'm not being a bit ironic there... I mean it.

Because of the crud you have to go through with OLE automation and the slowness of the
associated sp_OA* routines, I'd be willing to bet that a CLR would also beat the code I
posted for performance. We didn't test that in any of our deep dives, but I'm sure it's true and
a good CLR for that would be great.

But, like you said, performance isn't the only thing to consider when making the decision to use
a CLR. Look at the installation code you had to write. Can a "newbie" actually do that as you
suggested? I submit, probably not. And then, there are those hardcore DBA's that say "Not on
my watch" when it comes to CLR's. Better know how to do several different ways... something
a "newbie" wouldn't actually know, anyway.

As you've just shown, although there's a 50% difference in performance between that given CLR
and the fastest code I wrote in T-SQL to do the same thing but, consider for a moment that
it's only 60 milliseconds difference... across a million rows and extrapolating your run times on 10k
rows vs a million rows, thats only 6 seconds difference... and I have two lines in a WHERE clause...
with the CLR, there's source code to check in, a DLL to test and install, and an installation procedure
to make the code useful as a CLR function. Lord help us if we migrate servers.

Just to be clear, I'm not saying that all CLR's should be avoided. You said it quite correctly
when you said that some folks abuse them and you picked a very good example for what a CLR
should probably be used for. In fact, I have no problem with your good article at all. One of my major
points was that someone posted some a link to some pretty bad T-SQL and the race between that
and a CLR and a test that probably wasn't even done correctly. CLR's get glamourized because of
such mistakes much like set based code gets a bad name because someone writes some really bad
set based code to compare to. I want to make sure that, because of that miserable article that someone
referred us all to, that people don't go CLR crazy like they did when cursors were introduced in early
versions (6.5 I believe) of SQL Server.



--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

For better, quicker answers, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/

For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
Post #652124
Posted Friday, February 06, 2009 10:51 PM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Today @ 9:59 PM
Points: 20,119, Visits: 13,661
Jonathan Kehayias (2/6/2009)
Jeff Moden (2/6/2009)

Heh... listen to you... Anyone capable of writing either the CLR or the T-SQL isn't going to miss the sp_OADestroy. Anyone who doesn't understand enough about T-SQL will also like not know enough about how to prevent memory leaks using a CLR.


Not true. I've seen this happen in consulting twice for OLE automation and a few times for XML documents that get created but not removed. Its not that difficult of a mistake to make. Leaking in CLR is only possible with UNSAFE assemblies, as a part of the design of the hosting environment. The ability to leak memory requires Host Protection Attributes that are prevented from loading in anything but UNSAFE.

So does putting my directory code in a stored proc. And it doesn't require the monster code you included in your article just to install the CLR.


You could do everyone a favor and just reference them to the text file rather than posting the varbinary output string of a compiled assembly in here and making this page of the forums unusable because it doesn't know how to handle a string that long. You aren't comparing similar code, by looking at the binary compiled assembly object. The code is the C# and should the following connect item get added to SQL Server:

http://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=265266

the ability to write code similar to the following will exist in SQL:

CREATE PROCEDURE my_proc @par1 int,
@par2 int WITH
EXTERNAL LANGUAGE C#, PERMISSION_SET = EXTERNAL_ACCESS AS
using System;
using System.Data;
using System.Data.SqlClient;
...



That's what I mean... what's the difference between someone forgetting to do an sp_OADestroy and someone forgetting to mark a CLR as something other than UNSAFE especially if they're a newbie? Even if it gives you a warning, is a newbie gonna know what to do about it?

But, point well taken, hopefully on both parts... you do have a caveat for both methods.

Ya beat me to it on the other thing... I didn't realize the binary was quite so wide in your code and started fixing it as you were typing. It's fixed now. Thanks for the heads up, though.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

For better, quicker answers, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/

For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
Post #652125
Posted Friday, February 06, 2009 10:53 PM


Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Today @ 10:13 AM
Points: 1,310, Visits: 1,228
Its taken me close to a year to create a series of articles that have sufficient utility to be worthwhile, and are logically good use of SQLCLR so as to not be chewed alive by the community. Replacing less secure methodologies is my favorite place to make use of SQLCLR, though generally speaking, I wouldn't be reading directories from my TSQL code either, so I don't really have these examples in use in my own environment.

The EmailRegex I do have in use in a data warehouse that I built that loads data from all different kinds of sources, so it is very useful in the data cleansing process post load. I didn't do that in CLR for performance, more than I already had the code from answering a forums post, and it was easily reused. The difference in performance between the TSQL and the CLR isn't enough to make it matter to me. Like I said, if you are talking ms what did you really save and do you really notice it?

As for the installation script I provided, could anybody do that. Sure, if they were willing to research the "best practice" implementation for a EXTERNAL_ACCESS Assembly. Most aren't willing to go that route and instead just flip TRUSTWORTHY ON and move on with deploying under dbo. In fact, I think I have one of the few articles on the web that actually uses Keys over Trustworthy for non-SAFE assemblies. Since my purpose was to demonstrate how to be more "secure" with SQLCLR, setting TRUSTWORTHY wouldn't have really been good to do, especially in master, so I did it the right way, rather than the easy way. The installer script here does provide a simple framework that can be used for any assembly. Just a little copy paste rename kind of stuff to do.


Jonathan Kehayias
http://www.sqlclr.net
http://sqlblog.com/blogs/jonathan_kehayias
http://www.twitter.com/SQLSarg
Post #652126
Posted Friday, February 06, 2009 10:58 PM


Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Today @ 10:13 AM
Points: 1,310, Visits: 1,228
Jeff Moden (2/6/2009)

Ya beat me to it on the other thing... I didn't realize the binary was quite so wide in your code and started fixing it as you were typing. It's fixed now. Thanks for the heads up, though.


I wish there was a better way to provide that binary string in forums. At least here you can attach files to forums posts as text. Other places it is completely impossible to send someone a sample CLR compiled assembly for even a simple Hello World example. It is equally as long as the assembly I posted for reading the directory information.


Jonathan Kehayias
http://www.sqlclr.net
http://sqlblog.com/blogs/jonathan_kehayias
http://www.twitter.com/SQLSarg
Post #652127
Posted Friday, February 06, 2009 11:15 PM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Today @ 9:59 PM
Points: 20,119, Visits: 13,661
Jonathan Kehayias (2/6/2009)
so I did it the right way, rather than the easy way


BWAA-HAA!!! Man, I'm right there with you on that one! Now, all we have to do is convince the rest of the world, especially the managers that want it real bad, to give people just a little more time so they don't get it that way... real bad.

Good talking with you again, Jonathan. Kudos on the article.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

For better, quicker answers, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/

For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
Post #652130
Posted Friday, February 06, 2009 11:30 PM


Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Today @ 10:13 AM
Points: 1,310, Visits: 1,228
Jeff Moden (2/6/2009)
Jonathan Kehayias (2/6/2009)
so I did it the right way, rather than the easy way


BWAA-HAA!!! Man, I'm right there with you on that one! Now, all we have to do is convince the rest of the world, especially the managers that want it real bad, to give people just a little more time so they don't get it that way... real bad.

Good talking with you again, Jonathan. Kudos on the article.


I've got some more coming that it will be fun to see your comments on. All along this same line of processing, but its always interesting to see what you come up with for TSQL equivalents to the SQLCLR. As you say, the places I can write CLR to beat TSQL has been limited, and some of your tricks like appending a '/' at the end of a URL in TSQL to make domain name extraction lightening fast are very interesting and something to learn from.


Jonathan Kehayias
http://www.sqlclr.net
http://sqlblog.com/blogs/jonathan_kehayias
http://www.twitter.com/SQLSarg
Post #652133
Posted Saturday, February 07, 2009 1:06 AM


Valued Member

Valued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued Member

Group: General Forum Members
Last Login: Saturday, March 13, 2010 1:17 AM
Points: 51, Visits: 234
This solution is based on the premise that security is important enough to take the time to implement. If time is more important than preventing unrestricted access to the OS as the SQL Service account with xp_cmdshell, then this solution probably isn't for you.

There is nothing on the machine other than SQL Server databases protected by failover or log shipping. So what if the machine crashes; that's just a blip. If a hacker has ever been able to copy an .mdf file from a machine to outside the firewall because xp_cmdshell was turned on, I would like to read about it. Security isn't about the OS; it is about the company's data.

Data security principle -- developers make the access to the data visible to the administrators who are responsible for the security of production data, after the developers go on to other projects, by writing simple, compact T-SQL, tested for performance. What the application does with the data should not be hard to see inside a pile of clever code-- that is a data security issue. It's not about how easy or hard it is for the developer to write the code; it is about how the code is maintained and explained two years later.

I'm talking about database programming for business applications. I suppose UI considerations are primary for other applications.

My SQL Server policy is xp_cmdshell on and CLR off -- unless I hear of theft of an .mdf that was accomplished using xp_cmdshell.

And thanks to someone for fixing the window that displays code. It's normal width now.



_________________
"Look, those sheep have been shorn."
data analyst replies, "On the sides that we can see.."
Post #652144
Posted Saturday, February 07, 2009 4:04 AM


Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Today @ 2:56 AM
Points: 388, Visits: 1,518

That was it and I agree... someone like you wouldn't pass a bad path.


Remember the advice in 'The C Programming Language by Dennis M. Richie and Brian W. Kernighan' to always put in error handling even when an error couldn't possibly happen? I've lost count of the number of times I've thanked those guys for that advice. I've seen the error 'this cannot possibly happen' more times than I'd like to admit.

I'm not going to back down over my gentle advice that the CLR must react properly to all possible inputs. After all, it was because people neglected this that SQL Injection became possible. If a routine is for your own private use, then bless you, but if it is published on the internet then prepare for a comment by Phil.

Oh and another thing; it was particularly the use of the Asymmetric key that I was so pleased about in my original comment. It is sooo important.

Thanks for the update. I've now trashed my old CLR Directory routine, and adopted yours instead.

I don't yet see CLR as being a simple substitute for OLE automation. OLE is much quicker to lash together and maintain, if you have good templates and test harnesses. I see CLR more as a way of providing extensions to the SQL Server system to handle complex data. Now that is really exciting.......



Best wishes,

Phil Factor
Simple Talk
Post #652166
Posted Saturday, February 07, 2009 6:25 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Tuesday, March 03, 2009 5:31 PM
Points: 6, Visits: 116
Great article Jonathan!
Post #652183
Posted Saturday, February 07, 2009 7:25 AM
SSChasing Mays

SSChasing MaysSSChasing MaysSSChasing MaysSSChasing MaysSSChasing MaysSSChasing MaysSSChasing MaysSSChasing Mays

Group: General Forum Members
Last Login: Today @ 1:32 PM
Points: 666, Visits: 1,697
Jeff Moden (2/6/2009)
.... One of my major points was that someone posted some a link to some pretty bad T-SQL and the race between that
and a CLR and a test that probably wasn't even done correctly.....


Boy, am I glad to be "someone" now.....
I was perfectly aware that the T-SQL code is not the best. The idea of providing that link was to show the reasoning behind Andy's article on possible applications for SQLCLR. He explicitly wrote that the code is not the best.. I hope people read!
Besides that, it shows with plenty-o-external-links the process on creating a CLR, informative for anyone..

Yes, yes, mea culpa on trying to show everyone bad coding and trying to point that "people don't go CLR crazy". If someone just takes any code thrown at them without even thinking on the consequences of applying it in their environment, they have no business being a DBA/Developer.

At least it sparked a series of very informative discussion between you and Jonathan, with your always excellent code samples :)

Cheers




_______________________________________________________________________
For better assistance in answering your questions, click here
Post #652188
« Prev Topic | Next Topic »

«««34567»»»

Permissions Expand / Collapse