I understand that your primary intention is "to show how to use CLR integration in SQL Server to extend the engine's functionality and provide an overview of what can be done with the new BLOB/CLOB data", but this is exactly the kind of thing you shouldn't do with CLR integration.
Sure, there are exceptions, such as a single-user app using SQL Express, but for a robust, multi-tier, multi-user solution, this would have dire effects on the performance of the database server. As others have commented, this is best done in another tier of the solution.
You need to make it very, very clear that this would be a poor architectural choice.
Thanks to all of you for taking the time to comment on this article; there are some valid points here (and some I disagree with). A valid point: this should be done in another tier. For highly concurrent solutions, like an enterprise-level faxing application or a large website, this approach will present performance challenges. But for applications at that scale, you would probably be using unmanaged code anyway, right?
I disagree with "but this is exactly the kind of thing you shouldn't do with CLR integration". Here is a rule of thumb: you should not use CLR integration to replace things better left to T-SQL. And I can't imagine how you would implement something like this in T-SQL!
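For anyone who hasn't read the article, here is a minimal sketch of the kind of CLR function we are talking about: a scalar UDF that gzip-compresses a VARBINARY(MAX) value inside the engine. The class and function names are mine, and I'm using GZipStream for brevity; the article's actual implementation may differ.

```csharp
using System.Data.SqlTypes;
using System.IO;
using System.IO.Compression;
using Microsoft.SqlServer.Server;

public static class BlobCompression
{
    // Scalar CLR UDF: takes a VARBINARY(MAX) value and returns it gzip-compressed.
    [SqlFunction(IsDeterministic = true)]
    public static SqlBytes CompressBlob(SqlBytes input)
    {
        if (input.IsNull)
            return SqlBytes.Null;

        byte[] data = input.Value;
        using (MemoryStream output = new MemoryStream())
        {
            // GZipStream must be closed before reading the buffer,
            // so the final compressed block is flushed.
            using (GZipStream gzip = new GZipStream(output, CompressionMode.Compress))
                gzip.Write(data, 0, data.Length);
            return new SqlBytes(output.ToArray());
        }
    }
}
```

Once deployed with CREATE ASSEMBLY and CREATE FUNCTION, it can be called like any other scalar function, e.g. UPDATE Docs SET Body = dbo.CompressBlob(Body). Try doing that with T-SQL alone.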
I also recall a SQL 2000 application for server-side encryption, and it was not exactly a poor architectural choice. This is "a solution", not "the solution". It does provide the flexibility to avoid investing in application changes, testing, and redeployment, but you should not use it for highly scalable, highly concurrent applications.
Let me answer the question about possible applications, especially the third-party product I mentioned in my article. The product is currently being integrated into a government archiving application and a change-tracking application, and it is being evaluated by a large financial software provider for document management.
About performance: the worst part is returning compressed data; it takes about two-thirds of the time, and it is proportional to BLOB size. Some preliminary tests on a dual-core 1.4 GHz Pentium with 2 GB of RAM show decompression throughput of about 40 MB/s (around 30 MB/s for compression). That should give you some room before you decide to rewrite an application layer (or buy a new disk array). All in all, I advise you to take the solution, test its performance under a regular load, and use it if it is good enough. Just remember to monitor your systems so you don't inadvertently outgrow it (but that probably goes for everything else you write).
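If you want to reproduce those numbers on your own hardware before committing, a small console test like the following will do. The 16 MB repeating-text payload is an arbitrary choice of mine; substitute data that resembles your real BLOBs for honest numbers.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.IO.Compression;
using System.Text;

class GzipThroughput
{
    static void Main()
    {
        // Semi-compressible ~16 MB payload (20-char phrase repeated 800,000 times).
        byte[] data = Encoding.ASCII.GetBytes(
            new StringBuilder().Insert(0, "sample blob content ", 800000).ToString());

        Stopwatch sw = Stopwatch.StartNew();
        byte[] compressed;
        using (MemoryStream ms = new MemoryStream())
        {
            using (GZipStream gz = new GZipStream(ms, CompressionMode.Compress))
                gz.Write(data, 0, data.Length);
            compressed = ms.ToArray();
        }
        sw.Stop();
        Console.WriteLine("Compression:   {0:F1} MB/s",
            data.Length / 1048576.0 / sw.Elapsed.TotalSeconds);

        sw.Reset();
        sw.Start();
        byte[] buffer = new byte[81920];
        using (GZipStream gz = new GZipStream(
            new MemoryStream(compressed), CompressionMode.Decompress))
        {
            // Drain the stream to measure pure decompression throughput.
            while (gz.Read(buffer, 0, buffer.Length) > 0) { }
        }
        sw.Stop();
        Console.WriteLine("Decompression: {0:F1} MB/s",
            data.Length / 1048576.0 / sw.Elapsed.TotalSeconds);
    }
}
```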
Once again, thanks for your comments; I look forward to answering any questions that arise.
What about also encrypting the compressed content? A few months ago I was writing a UDT that automatically compresses/decompresses its content. It is not very useful, since a UDT can't be bigger than 8 KB, but I found that automatic encryption of the compressed content would be a nice feature.
Again, we can discuss whether this should be done in the database layer using CLR or in the application layer, but as I explained in the article, if other RDBMSs are offering this feature, someone is probably using it.
You can find this UDT at http://jcarlossaez.blogspot.com/2006/11/compressed-and-encrypted-user-defined.html
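In case the link goes down, the core of the idea is just two chained streams: compress first, then encrypt, since encrypted data doesn't compress. A stripped-down sketch (method names are mine, and the key/IV handling here is deliberately naive; real key management is the hard part):

```csharp
using System.IO;
using System.IO.Compression;
using System.Security.Cryptography;

public static class CompressThenEncrypt
{
    // Gzip the plaintext, then AES-encrypt the compressed bytes.
    // key must be 16/24/32 bytes, iv 16 bytes; naive handling for brevity.
    public static byte[] Pack(byte[] plain, byte[] key, byte[] iv)
    {
        using (SymmetricAlgorithm aes = Rijndael.Create())
        using (MemoryStream ms = new MemoryStream())
        {
            using (CryptoStream crypto = new CryptoStream(
                       ms, aes.CreateEncryptor(key, iv), CryptoStreamMode.Write))
            using (GZipStream gzip = new GZipStream(crypto, CompressionMode.Compress))
            {
                gzip.Write(plain, 0, plain.Length);
            } // disposing gzip, then crypto, flushes the final cipher block
            return ms.ToArray(); // ToArray still works after ms is closed
        }
    }

    // Reverse order: decrypt the stream, then decompress what comes out.
    public static byte[] Unpack(byte[] packed, byte[] key, byte[] iv)
    {
        using (SymmetricAlgorithm aes = Rijndael.Create())
        using (CryptoStream crypto = new CryptoStream(
                   new MemoryStream(packed), aes.CreateDecryptor(key, iv),
                   CryptoStreamMode.Read))
        using (GZipStream gzip = new GZipStream(crypto, CompressionMode.Decompress))
        using (MemoryStream result = new MemoryStream())
        {
            byte[] buffer = new byte[4096];
            int read;
            while ((read = gzip.Read(buffer, 0, buffer.Length)) > 0)
                result.Write(buffer, 0, read);
            return result.ToArray();
        }
    }
}
```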
SQL Server 2005 has built-in encryption, a pretty cool feature that MS acquired from three different vendors. Check Books Online; you will probably find plenty of documentation there. Having implemented encryption for SQL 2000 myself, I can tell you that managing keys is the biggest problem you will face, but 2005 has already addressed that.
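To give a flavor of the 2005 feature, a round trip from ADO.NET looks roughly like this. The names TestKey and TestCert are illustrative only, and I'm assuming a symmetric key protected by a certificate has already been created with CREATE SYMMETRIC KEY / CREATE CERTIFICATE.

```csharp
using System;
using System.Data.SqlClient;

class BuiltInEncryptionDemo
{
    static void Main()
    {
        using (SqlConnection conn = new SqlConnection(
            "Server=.;Database=Demo;Integrated Security=true"))
        {
            conn.Open();

            // The key must be opened in the session before it can be used.
            new SqlCommand(
                "OPEN SYMMETRIC KEY TestKey DECRYPTION BY CERTIFICATE TestCert;",
                conn).ExecuteNonQuery();

            // Encrypt and immediately decrypt to show the round trip.
            SqlCommand cmd = new SqlCommand(
                "SELECT CONVERT(NVARCHAR(100), " +
                "DECRYPTBYKEY(ENCRYPTBYKEY(KEY_GUID('TestKey'), @plain)));",
                conn);
            cmd.Parameters.AddWithValue("@plain", "sensitive value");
            Console.WriteLine((string)cmd.ExecuteScalar()); // prints the plaintext

            new SqlCommand("CLOSE SYMMETRIC KEY TestKey;", conn).ExecuteNonQuery();
        }
    }
}
```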
You are right, other RDBMSs have server-side column-level compression; MySQL is one of them. There must be a reason for including such functionality, and it works just fine for them. Why can't we do it in SQL Server? Is it any less of a database engine? But then again, it is a matter of approach and what works for you, based on the challenges your environment poses.