As a side bar, Matt Miller and I had some friendly races a couple/three years ago on Regex and the results came out pretty close in a lot of the cases.
Hey Jeff. So, when you say that you implemented RegEx in T-SQL, you were able to emulate macros (such as \w and \d) or do things like capture groups and look-ahead operators? I know that the LIKE clause can do a very basic RegEx with single-character [a-z,0-9] syntax but not much else. I have seen that .Net RegEx is slightly slower than a LIKE clause but it seems to be 1000 times more powerful so I kinda like having the option to use the full RegEx functionality.
So far as FTP goes, that's not so difficult from a sproc if you can make a trip to the batch world. Same goes with file handling. Heh... yeah, I know... lot's of people can't go there. Never understood that with the likes of proxies, etc. Seems like there's a general paranoia about the Command Prompt these days. Done correctly, it's no worse than "safe" CLR's.
Regarding FTP and even GZip, I took a slightly different approach. In fact, I took a dual approach in which I did the standard FTP a file and GZip a file but then I also have functions that deal with the data directly. By "directly" I mean that you can FTP the contents of a local variable or column from a table. Same with GZip: you can GZip the contents of a variable or column such as:
SET tab.PDFCompressed = SQL#.Util_GZip(tab.PDFData)
FROM MyTable tab
WHERE tab.PDFCompressed IS NULL
Or that could have been a Stored Proc that got the PDF (or JPG / GIF) data passed in as a VARBINARY and that input variable was passed into Util_GZip so that the compressed version was actually stored:
INSERT INTO MyTable (PDFCompressed)
In this case, you can save room in the DB if you are storing binary data and still mask that it is compressed since any query (via proc, Reporting Services, ad-hoc query, etc.) can simply call:
SELECT SQL#.Util_GUnzip(tab.PDFCompressed) AS [PDFData]
FROM MyTable tab
WHERE tab.SomeField = @SomeValue
So, I think this way of looking at FTP and GZip offers some interesting options to people that xp_cmdshell implementations cannot. And yes, the new FileStream option in SQL Server 2008 might be even better, but again still good to have options, especially because you can interact with those as VARBINARY columns so these functions still might work the same way against those.
Another thing to consider is that by opening up xp_cmdshell you are giving access to anything that can be called from a command line, whereas in these File System CLR functions the user only has access to the few functions that have been implemented, not everything on the server. To safe guard in xp_cmdshell you would have to restrict the login associated with the SQL Server service to not even be able to enter / read certain directories, but that might then exclude calling "ftp" unless even more fine-grained permissions are applied. And while that is possible, it seems easier to just restrict xp_cmdshell altogether and then selectively GRANT / DENY access to some CLR procs/functions to whatever Logins and/or Roles should be using them.
Anyway, nice article and thanks for the feedback.
Thanks and yer welcome 🙂