Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase

Extracting files from "binary" stored in TEXT datatype Expand / Collapse
Author
Message
Posted Sunday, May 12, 2013 1:25 PM
Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Monday, December 15, 2014 10:36 AM
Points: 312, Visits: 323
I've been asked to look into extracting files (mostly jpg or pdf) from a database where the files are stored in the TEXT data type. There isn't a supporting column in the ole binaries table indicating the file type or extension. The vendor's database currently sits on a SQL 2008 R2 instance, but I do know that it was originally developed and ran on SQL 2000, if not earlier.

Since it was originally developed prior to SQL 2005, and since I'm assuming the software vendor didn't subsequently rewrite their original ole extraction to varbinary bulk extraction, I'm left guessing what legacy method the original developers might have used to import and export embedded files to binary in their database. Their front-end is black box, so I can't see the source code they use.

I do know the data resides on a TEXT data field, and the column value size is typically something like 5637935 characters. Selecting the field value in SSMS yields something like ÿØÿáN&Exif .

Does any of this ring a bell for the group, and is there anyone who might be able to point me in the right direction that I might extract the files directly from the database?

Thanks in advance for your consideration.

Post #1451926
Posted Monday, May 13, 2013 7:29 AM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Yesterday @ 3:05 PM
Points: 13,320, Visits: 12,804
I would guess that they used some sort of stream writer. You are going to have to write an application for this. There is no t-sql that can do this that I am aware of. Basically you read the contents into a byte array to "reassemble" the gibberish into a file. You could probably convert this into a CLR if you need to use it repeatedly.

If you are doing this in .NET the basic syntax would be very close to this. I am assuming that dt is a DataTable that has the text column as a column in the DataTable named "PDF".

System.IO.File.WriteAllBytes(@"C:\YourFolder\YourFile.pdf", dt.Rows[0]["PDF"] as byte[]);



_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Moden's splitter.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs
Understanding and Using APPLY (Part 1)
Understanding and Using APPLY (Part 2)
Post #1452093
Posted Monday, May 13, 2013 8:41 AM
SSC Eights!

SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!

Group: General Forum Members
Last Login: Monday, December 15, 2014 9:03 AM
Points: 851, Visits: 5,596
Your key problem will be working out what the encoding is.

It is probably Base64. This site may help:

http://www.opinionatedgeek.com/dotnet/tools/base64decode/
Post #1452148
Posted Monday, May 13, 2013 1:17 PM
Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Monday, December 15, 2014 10:36 AM
Points: 312, Visits: 323
Thanks for the pointers! I will try these both tonight.

Attempting to do this in CLR or .NET shouldn't be too difficult.

This is my first attempt to encode / decode to embedded objects in anything prior to SQL 2005. I guess I've gotten spoiled by varbinary(max) data type and BulkColumn FROM OPENROWSET, etc.

I'll let you know in the next day or so whether I have any success.

Thanks,

Andre Ranieri
Post #1452287
Posted Wednesday, May 15, 2013 8:05 AM
Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Monday, December 15, 2014 10:36 AM
Points: 312, Visits: 323
The object is using the deprecated Text datatype, and reading the string directly from the database shows a text header like ÿØÿá6ªExif, which looks a lot like the EXIF headers if you read the contents of a jpeg file directly in notepad.

I can use a stream to copy the contents of one jpeg file to another. Can I use that same stream to write directly to the text field in the vendor's SQL database? If so, how would this be done?

Andre




string fromPath = @"c:\temp\test1.jpg";
string toPath = @"c:\temp\test2.jpg";

using (Stream source = File.OpenRead(fromPath))
using (Stream dest = File.Create(toPath))
{
byte[] buffer = new byte[1024];
int bytes;
while ((bytes = source.Read(buffer, 0, buffer.Length)) > 0)
{
dest.Write(buffer, 0, bytes);
}
}
Post #1453114
« Prev Topic | Next Topic »

Add to briefcase

Permissions Expand / Collapse