Unzipping Word Documents in T-SQL

  • Comments posted to this topic are about the item Unzipping Word Documents in T-SQL

  • Wow, you've obviously had a lot of 'fun' with this, well done!

    Suggesting that image files are not compressed is, however, incorrect. They are heavily compressed, just not using Zip.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • You are right Phil. Images are heavily compressed. I did not express myself correctly as I meant that images are not compressed XML files. You are also right saying I had fun writing this function!

    • This reply was modified 5 months, 2 weeks ago by  cmartel 20772.
  • Pretty cool, just one thing to consider.

    SQL is the most expensive CPU license that one might pay for.

    For that reason, I'm out...

  • Wow!  Ingenious.

    It's surprising that DECOMPRESS(..) and COMPRESS(..) work with true gzip files (rather than a raw stream from the deflate algorithm) but then I suppose anyone wanting compressed storage for SQL Server should use the built-in compression features, and this function is just for those oddball "interoperate with other systems" type situations.

    I'm impressed you construct a fake GZIP file by assembling the header, deflate contents, CRC32 etc from the data that's stored in the ZIP directory.

    I can see how this is useful for one off tasks when you have ten thousand docx file and nobody bothered to store the metadata (author, title, etc) for each file in a nice normal table!

  • Hello Brain2000,

    You are right mentioning that SQL Server licenses are expensive, and I admit that unzipping on the server is unusual and unexpected.

    Like Ool2t5902 mentions, I use the function to extract metadata. Actually, I use a stripped version of the function that only decompresses and returns the required xml file, and this is pretty fast. I published a full function for clarity purposes.

    Thanks.

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply