Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase 123»»»

Using CLR integration to compress BLOBs/CLOBs in SQL Server 2005 Expand / Collapse
Author
Message
Posted Tuesday, January 9, 2007 10:20 AM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Tuesday, January 21, 2014 11:24 AM
Points: 12, Visits: 76
Comments posted here are about the content posted at http://www.sqlservercentral.com/columnists/ymartinez/2798.asp
Post #335420
Posted Tuesday, January 30, 2007 8:34 AM
Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Thursday, January 30, 2014 10:08 PM
Points: 1,038, Visits: 444
Thanks for the article - I had read it on CodeProject a couple of weeks ago (http://www.codeproject.com/cs/database/blob_compress.asp). Wanted to make sure each site had the same author, which it appears is the case!

It's a clever application of .NET procs to do something that is definitely not suitable for T-SQL. I do wonder though if this is something better done by the business layer to avoid complex processing of data on the DB Server. Also, the compression of data across the link between the client/web-server & DB tiers may be desired - (de)compressing at the DB server helps with storage but not with traffic levels.
However, the ideas presented could certainly be useful in smaller 2-tier apps where the DB server is on a LAN and transparent access to the decompressed BLOB data via a view is desired. I'd be interested in hearing about any experiences people have had with such an approach...
Thanks for the article



Post #340867
Posted Tuesday, January 30, 2007 2:27 PM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Tuesday, September 11, 2007 5:42 PM
Points: 46, Visits: 1
Thanks for the article.
Honestly though, I'm not sure how useful this will be in practice. To begin with, as mentioned, this is not going to be a good way to deal with larger objects (larger than 5MB was mentioned at the end of the article). Also, moving a compress/decompress function into the database's process could be a nightmare for performance.
Overall, the idea of compressing data before pushing it into a BLOB is a great idea, but this should almost certainly be handled in application code rather than inside the database.
Still, it does get one thinking about it.
Post #341037
Posted Tuesday, January 30, 2007 2:56 PM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Monday, August 4, 2014 7:59 AM
Points: 40, Visits: 49

I understand that your primary intention is "to show how to use CLR integration in SQL Server to extend the engines functionality and provides an overview on what can be done with the new BLOB/CLOB data", but this is exactly the kind of thing you shouldn't do with CLR integration.

Sure, there are exceptions, such as a single-user app using SQL Express, but for a robust, multi-tier, multi-user solution, this would have dire effects on the performance of the database server. As others have commented, this is best done in another tier of the solution.

You need to make it very, very clear that this would be a poor architectural choice.

Post #341052
Posted Tuesday, January 30, 2007 4:20 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: Administrators
Last Login: Today @ 6:13 PM
Points: 33,198, Visits: 15,341
Good points there, but I guess I'd like to hear more about why this is a bad move or where it makes sense and where it doesn't. If you're driving the home page of a website, maybe this is bad.

If you're storing off purchase order images that will rarely be accessed, is this a bad decision? Even in a multi-user system?

Maybe someone will write some architectural articles about how to recognize good and bad uses of the CLR







Follow me on Twitter: @way0utwest

Forum Etiquette: How to post data/code on a forum to get the best help
Post #341079
Posted Tuesday, January 30, 2007 6:54 PM


SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Thursday, May 29, 2014 9:29 AM
Points: 214, Visits: 647
Interesting article - nice job.

Oh how I love the posts stating "don't do this or that" ... you'd think that their authors either 1) got too bloated on Microsoft documentation circa 1998, or 2) they've been taking Joe Celko posts at face value . I often think of a passage from Ken Henderson's second SQL Server 2000 book, when he quotes Steve McConnell in the following;

"To a great designer, not applying knowledge is tantamount to not having obtained the knowledge in the first place" (Henderson, 2002).

I would hope that those same readers would at least post their alternative, stating the reasons for their belief to go along with their rebuttal, rather than a simple “nay”.

Frankly, I enjoy the creativity here. If not for the idea in its form as presented, certainly for the thinking behind it, the very type of creativity that makes us all better at database development. Lest we not forget that someday the relational database will not be as we know it now. In fact, many are surprised that those concepts invented back in the mid to late 1970's are still around! I believe that in the next 20 years, advances in hardware, as example, might lead us to not even think of what were once "data retrieval problems" as we now know them. Dare we venture to think that what was once unacceptable in the relational database model might gain acceptance? Sure, you may never use this type of coding, but who cares? Completely immaterial from my point of view.

Again - nice job here. Thank goodness that we didn’t get another article on SET NOCOUNT ON, ROWCOUNT, or something so exciting as the advantages of SET over SELECT.


References:
Henderson, Ken (2002). The Guru's Guide to SQL Server Stored Procedures, XML, and HTML. Indianapolis: Addison-Wesley.







Lee Everest

Post #341109
Posted Tuesday, January 30, 2007 8:28 PM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Tuesday, January 21, 2014 11:24 AM
Points: 12, Visits: 76

Thanks to all of you for taking the time to comment on this article, there are some valid points here (and some I disagree with). A valid point: This should be done in another tier. For highly concurrent solutions, like an enterprise-level faxing application or a large website, this approach will present performance challenges. But for such applications you probably will use unmanaged code RIGHT!?

I disagree with “but this is exactly the kind of thing you shouldn't do with CLR integration”.  Here is a rule of thumb: you should not replace things better left to T-SQL with CLR integration. I can’t imagine how you would do something like this in T-SQL!

I also recall a SQL 2000 application for server side encryption and it was not exactly a poor architectural choice. This is “A solution” not “THE solution”. It does provide the flexibility to avoid investing in application changes, testing and redeployment, BUT you should not use it for highly scalable, highly concurrent applications.

Let me answer the question about possible applications, specially the third party product I mentioned in my article. The product is currently being integrated into a government archiving application, a change tracking application and reviewed by a large financial software provider for document management.

About performance, the worst part is returning compressed data. It takes about 2/3 of the time and it is proportional to blob size. Some preliminary tests in a dual core Pentium 1.4 GHz with 2 GB of ram show about 40 MB per second decompression time (around 30 MB/s for compression). That should give you some room before you decide to rewrite an application layer (or buy a new disk array). All in all, I advise you to take the solution, test its performance under regular load and if it is good enough use it. Just remember to monitor your systems to avoid inadvertently outgrowing it (but that probably goes for everything else you write).

Once again, thanks for your comments and I look forward to answering any questions or inquiries that might arise.

Post #341140
Posted Wednesday, January 31, 2007 1:42 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Friday, March 28, 2014 9:15 AM
Points: 4, Visits: 126

And what about to also encrypt the compressed content? Months ago, I was writing a UDT that automatically compress/decompress the content.  Not too much useful since and UDT can't be bigger than 8K, but I found that automatic encryption for the compressed content would be a nice feature.

Again, we can discuss if this should be done by database layer using CLR or by application layer, but as I explain in the article, if others RDBMS are offering this feature, probably someone is using it.

You can find this UDT at http://jcarlossaez.blogspot.com/2006/11/compressed-and-encrypted-user-defined.html

Post #341206
Posted Wednesday, January 31, 2007 7:47 AM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Tuesday, January 21, 2014 11:24 AM
Points: 12, Visits: 76

Juan Carlos:

SQL Server 2005 has built-in encryption, a pretty cool feature that MS acquire from three different vendors. Check the books online, you probably find plenty of documentation there. Having implemented encryption for SQL 2000 myself, I can tell you that managing keys is the biggest problem you will face, but 2005 already fixed that.

You are right, other RDBMS have server-side column-level compression, MySQL is one of them. There must be a reason for including such functionality, and it works just fine for them. Why can't we do it in SQL Server? Is it less of a database engine?  But then again, it is a matter of approach and what works for you based on what challenges your environment post.

Post #341318
Posted Wednesday, January 31, 2007 10:58 AM
SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Tuesday, May 6, 2014 5:51 AM
Points: 6,266, Visits: 2,028
This sort of application as well as the encryption one do make a lot of sense in mobile devices. In there you need to minimize space and maximize security.

That said neither of the two make sense on *large* database deployments. So I think Steve was right on the money it is where you use this what make or breaks the usefulness.

Just my $0.02



* Noel
Post #341392
« Prev Topic | Next Topic »

Add to briefcase 123»»»

Permissions Expand / Collapse