Using CLR integration to compress BLOBs/CLOBs in SQL Server 2005
Posted Thursday, July 16, 2009 9:28 PM
Grasshopper

Group: General Forum Members
Last Login: Friday, September 12, 2014 11:18 AM
Points: 23, Visits: 293
I am on SP2.
Exact build number posted above.

Really wanted to avoid a 3rd-party tool. But if they post the code, it could be useful.
Shame that M$ could not just do it right to begin with. These kinds of use cases were easy to predict.

Thanks.
Post #754587
Posted Friday, July 17, 2009 7:13 AM
SSC Journeyman

Group: General Forum Members
Last Login: Wednesday, March 12, 2014 8:47 AM
Points: 76, Visits: 756
We implemented this technique successfully, but our BLOBs were typically in the 10-50K range - never more than 5 MB - and most access was via a single row (2 or 3 blobs per row). We did test this across several million rows, including multi-row statements, and never saw any issues, though our testing was limited to the blob sizes I mentioned earlier.
Post #754797
Posted Friday, July 17, 2009 7:47 AM
Grasshopper

Group: General Forum Members
Last Login: Friday, September 12, 2014 11:18 AM
Points: 23, Visits: 293
To be fair, we can also limit the size of the BLOBs we push through compression.

The average document size in our business case is about 100-300K; those make up 80-90% of the data we have to handle. There is also the question of what to compress vs. what NOT to compress. PDFs and BMPs, for example, just don't compress well. So the compressible text documents we have tend to be smaller. This still might work.

I just wish we did not have to code in a bunch of exceptions....
And also did not have to code a bunch of loops here and there as a workaround.

This should have been a simple one-liner: INSERT.... SELECT.... fn_compress(BLOB)....
Geez.

Anyway, THANKS, fellows. I am getting good practical ideas out of this discussion. It still might work without resorting to 3rd-party complications.
Post #754835
Posted Friday, July 17, 2009 8:28 AM
SSC Journeyman

Group: General Forum Members
Last Login: Wednesday, March 12, 2014 8:47 AM
Points: 76, Visits: 756
Our content was text and XML, so compression rates were very high.
Good luck and keep us posted on your progress!
Post #754866
Posted Friday, July 17, 2009 10:15 AM
Grasshopper

Group: General Forum Members
Last Login: Tuesday, January 21, 2014 11:24 AM
Points: 12, Visits: 76
Make your compression function deterministic by decorating it with [Microsoft.SqlServer.Server.SqlFunction(IsDeterministic = true)]. Then wrap the function in a T-SQL user-defined function and check the BLOB length with DATALENGTH() before calling the CLR compressor. If it exceeds the upper limit (1 MB, or whatever you choose), convert to VARBINARY(MAX) and prepend a header byte of 0 (do not call the compression function); otherwise compress the BLOB and prepend a header byte indicating compression.
For decompression, check the header byte and either call the decompression function or just convert back to text...
CREATE FUNCTION dbo.udf_compress (@blob NVARCHAR(MAX))
RETURNS VARBINARY(MAX)
AS
BEGIN
    -- Above the size limit: store uncompressed, prefixed with a 0x00 header byte.
    IF (DATALENGTH(@blob) > (2 * 1024 * 1024))
        RETURN CONVERT(VARBINARY(MAX), 0x0) + CONVERT(VARBINARY(MAX), @blob);

    -- Otherwise: compress and prefix with a 0x01 header byte.
    RETURN CONVERT(VARBINARY(MAX), 0x1) + dbo.fn_Compress(@blob);
END

To determine whether it was compressed:

SELECT CASE WHEN SUBSTRING(@compressed, 1, 1) > 0 THEN 'Decompress'
ELSE 'Convert'
END
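
For reference, here is a minimal sketch of what the CLR side of such a function could look like. This is not the article's actual code - fn_Compress, the GZipStream choice and the SqlBytes parameter type are assumptions for illustration (use SqlChars if your wrapper passes NVARCHAR(MAX)):

using System.Data.SqlTypes;
using System.IO;
using System.IO.Compression;
using Microsoft.SqlServer.Server;

public partial class UserDefinedFunctions
{
    // IsDeterministic = true is the decoration discussed above; DataAccess = None
    // tells SQL Server the function never queries the database.
    [SqlFunction(IsDeterministic = true, DataAccess = DataAccessKind.None)]
    public static SqlBytes fn_Compress(SqlBytes blob)
    {
        if (blob.IsNull)
            return SqlBytes.Null;

        using (MemoryStream compressed = new MemoryStream())
        {
            using (GZipStream gzip = new GZipStream(compressed, CompressionMode.Compress))
            {
                gzip.Write(blob.Value, 0, (int)blob.Length);
            }
            return new SqlBytes(compressed.ToArray());
        }
    }
}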

Hope this helps
Post #754976
Posted Monday, July 20, 2009 9:03 AM
Grasshopper

Group: General Forum Members
Last Login: Friday, September 12, 2014 11:18 AM
Points: 23, Visits: 293
OK, I found that compressing the records in a loop (I used a cursor) gets the job done with no "out of memory" issues. It is still quick enough, and it is also restartable if needed (with a little extra management coded in, that is). Worked for me.
Post #755871
Posted Monday, October 19, 2009 9:26 AM
Forum Newbie

Group: General Forum Members
Last Login: Thursday, November 26, 2009 3:05 PM
Points: 2, Visits: 39
Hi,

I have a problem. I have a SQL Server 2005 server that keeps raising the following error:

.NET Framework execution was aborted by escalation policy because of out of memory.

In the machine's event log, SQL indicates that the process was then blocked.

Can someone back me up? I believe the problem is not really SQL's memory as such.
Post #805134
Posted Tuesday, October 20, 2009 7:44 AM
Grasshopper

Group: General Forum Members
Last Login: Tuesday, January 21, 2014 11:24 AM
Points: 12, Visits: 76
Please provide your SQL Server version and SP level for better support. I have seen that error while working with VERY large BLOBs on pre-SP1 servers (I didn't try SP2 or SP3). The problem is in your CLR integration sandbox when it tries to allocate space for a VERY large array (e.g. Stream.Buffer). SQLCompress.NET solves that problem by streaming data out of the BLOB into another SqlBytes, which uses tempdb instead of memory. If the problem persists and you really want compression on your database, you could download SQLCompress.NET (which works with SQL 2008 as well).
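
For anyone hitting the same error, here is a rough sketch of that streaming idea - it is not SQLCompress.NET's actual code, and the function name and chunk size are illustrative. The point is to read the input through SqlBytes.Stream in small chunks rather than materializing the whole BLOB as one large array; the product reportedly also backs the output with tempdb, which this sketch does not do:

using System.Data.SqlTypes;
using System.IO;
using System.IO.Compression;
using Microsoft.SqlServer.Server;

public partial class UserDefinedFunctions
{
    [SqlFunction(IsDeterministic = true, DataAccess = DataAccessKind.None)]
    public static SqlBytes fn_CompressStreamed(SqlBytes blob)
    {
        if (blob.IsNull)
            return SqlBytes.Null;

        MemoryStream output = new MemoryStream();
        using (GZipStream gzip = new GZipStream(output, CompressionMode.Compress, true))
        {
            // Read the BLOB through its Stream in fixed-size chunks instead of
            // allocating a single array the size of the whole value.
            byte[] chunk = new byte[64 * 1024];
            int read;
            while ((read = blob.Stream.Read(chunk, 0, chunk.Length)) > 0)
            {
                gzip.Write(chunk, 0, read);
            }
        }
        // Note: the compressed result is still buffered in memory here.
        return new SqlBytes(output);
    }
}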
Post #805709
Posted Thursday, February 11, 2010 7:25 AM
Forum Newbie

Group: General Forum Members
Last Login: Wednesday, January 16, 2013 8:33 AM
Points: 1, Visits: 25
Thanks very much, a very useful article!

There appear to be a couple of tiny issues with the code, however.


1. Buffer is an internal buffer which may or may not contain the whole set of bytes in the blob.

byte[] blobData = blob.Buffer;

I think it should be this:

byte[] blobData = blob.Value;

Alternatively, you could read chunks from the stream in a loop and write them into the compressor.


2. The decompressedData needs to be flushed before closing.

decompressedData.Flush()



Without these changes, my blobs had issues where the decompressed data was not identical to the original data:

CHECKSUM(myBlob) <> CHECKSUM(Decompress(Compress(myBlob)))
and
myBlob <> Decompress(Compress(myBlob))
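
For illustration, here is a minimal sketch of a decompression function with both fixes applied. It assumes a GZipStream-based compressor like the article's; the names are illustrative, not the original code:

using System.Data.SqlTypes;
using System.IO;
using System.IO.Compression;
using Microsoft.SqlServer.Server;

public partial class UserDefinedFunctions
{
    [SqlFunction(IsDeterministic = true, DataAccess = DataAccessKind.None)]
    public static SqlBytes fn_Decompress(SqlBytes blob)
    {
        if (blob.IsNull)
            return SqlBytes.Null;

        MemoryStream decompressedData = new MemoryStream();
        // Fix 1: blob.Value (not blob.Buffer) is guaranteed to contain the whole BLOB.
        using (MemoryStream input = new MemoryStream(blob.Value))
        using (GZipStream gzip = new GZipStream(input, CompressionMode.Decompress))
        {
            byte[] chunk = new byte[64 * 1024];
            int read;
            while ((read = gzip.Read(chunk, 0, chunk.Length)) > 0)
            {
                decompressedData.Write(chunk, 0, read);
            }
        }
        // Fix 2: flush the output stream before handing it off (a no-op for a
        // MemoryStream, but required if the output is a file or SqlBytes stream).
        decompressedData.Flush();
        return new SqlBytes(decompressedData);
    }
}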
Post #864059
Posted Wednesday, October 16, 2013 5:39 PM
SSC-Enthusiastic

Group: General Forum Members
Last Login: Friday, May 30, 2014 4:30 AM
Points: 174, Visits: 179
It would be nice to extend the same function to proper zip/unzip archival. Wikipedia has published the ZIP archive format specification. Here is the link.

http://en.wikipedia.org/wiki/ZIP_(file_format)

I did so in an internal project, and I don't feel like sharing the code, but given a good description it is accessible to practically any developer.
Note the 16-byte data descriptor field; it comes in handy if you don't feel like saving the buffer(s) prior to writing to the stream.

It became somewhat important to our team because a third party accepts the compressed data. My feeling is that sharing code (to enable consistent decompression) does not always work. However, writing to a common interface documented as a standard is nice. Having the ability to drop the file into Windows Explorer, double-click it, and see the content comes as a bonus :)
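
For comparison, if hand-rolling the format is not a requirement, newer .NET versions (4.5 and later, so not an option inside the SQL 2005 CLR host) can produce the same Explorer-friendly result with System.IO.Compression.ZipArchive. A minimal sketch, with the class and entry name purely illustrative:

using System.IO;
using System.IO.Compression;

static class ZipBlobWriter
{
    // Wraps one blob in a standard .zip container so a third party (or Windows
    // Explorer) can open it without any custom decompression code.
    public static byte[] WrapAsZip(byte[] blobData, string entryName)
    {
        using (MemoryStream output = new MemoryStream())
        {
            using (ZipArchive archive = new ZipArchive(output, ZipArchiveMode.Create, true))
            {
                ZipArchiveEntry entry = archive.CreateEntry(entryName, CompressionLevel.Optimal);
                using (Stream entryStream = entry.Open())
                {
                    entryStream.Write(blobData, 0, blobData.Length);
                }
            }
            return output.ToArray();
        }
    }
}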


Legal disclaimer:

Please do not treat this entry as anything other than a personal note.
Post #1505453