Extracting files from "binary" stored in TEXT datatype

  • I've been asked to look into extracting files (mostly jpg or pdf) from a database where the files are stored in the TEXT data type. There isn't a supporting column in the ole binaries table indicating the file type or extension. The vendor's database currently sits on a SQL 2008 R2 instance, but I do know that it was originally developed and ran on SQL 2000, if not earlier.

    Since it was originally developed prior to SQL 2005, and since I'm assuming the software vendor didn't subsequently rewrite their original ole extraction to varbinary bulk extraction, I'm left guessing what legacy method the original developers might have used to import and export embedded files to binary in their database. Their front-end is black box, so I can't see the source code they use.

    I do know the data resides on a TEXT data field, and the column value size is typically something like 5637935 characters. Selecting the field value in SSMS yields something like ÿØÿáN&Exif .

    Does any of this ring a bell for the group, and is there anyone who might be able to point me in the right direction that I might extract the files directly from the database?

    Thanks in advance for your consideration.

  • I would guess that they used some sort of stream writer. You are going to have to write an application for this. There is no t-sql that can do this that I am aware of. Basically you read the contents into a byte array to "reassemble" the gibberish into a file. You could probably convert this into a CLR if you need to use it repeatedly.

    If you are doing this in .NET the basic syntax would be very close to this. I am assuming that dt is a DataTable that has the text column as a column in the DataTable named "PDF".

    System.IO.File.WriteAllBytes(@"C:\YourFolder\YourFile.pdf", dt.Rows[0]["PDF"] as byte[]);

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Your key problem will be working out what the encoding is.

    It is probably Base64. This site may help:

    http://www.opinionatedgeek.com/dotnet/tools/base64decode/

  • Thanks for the pointers! I will try these both tonight.

    Attempting to do this in CLR or .NET shouldn't be too difficult.

    This is my first attempt to encode / decode to embedded objects in anything prior to SQL 2005. I guess I've gotten spoiled by varbinary(max) data type and BulkColumn FROM OPENROWSET, etc.

    I'll let you know in the next day or so whether I have any success.

    Thanks,

    Andre Ranieri

  • The object is using the deprecated Text datatype, and reading the string directly from the database shows a text header like ÿØÿá6ªExif, which looks a lot like the EXIF headers if you read the contents of a jpeg file directly in notepad.

    I can use a stream to copy the contents of one jpeg file to another. Can I use that same stream to write directly to the text field in the vendor's SQL database? If so, how would this be done?

    Andre

    string fromPath = @"c:\temp\test1.jpg";

    string toPath = @"c:\temp\test2.jpg";

    using (Stream source = File.OpenRead(fromPath))

    using (Stream dest = File.Create(toPath))

    {

    byte[] buffer = new byte[1024];

    int bytes;

    while ((bytes = source.Read(buffer, 0, buffer.Length)) > 0)

    {

    dest.Write(buffer, 0, bytes);

    }

    }

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply