SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Extracting files from "binary" stored in TEXT datatype


Extracting files from "binary" stored in TEXT datatype

Author
Message
Andre Ranieri
Andre Ranieri
Mr or Mrs. 500
Mr or Mrs. 500 (555 reputation)Mr or Mrs. 500 (555 reputation)Mr or Mrs. 500 (555 reputation)Mr or Mrs. 500 (555 reputation)Mr or Mrs. 500 (555 reputation)Mr or Mrs. 500 (555 reputation)Mr or Mrs. 500 (555 reputation)Mr or Mrs. 500 (555 reputation)

Group: General Forum Members
Points: 555 Visits: 379
I've been asked to look into extracting files (mostly jpg or pdf) from a database where the files are stored in the TEXT data type. There isn't a supporting column in the ole binaries table indicating the file type or extension. The vendor's database currently sits on a SQL 2008 R2 instance, but I do know that it was originally developed and ran on SQL 2000, if not earlier.

Since it was originally developed prior to SQL 2005, and since I'm assuming the software vendor didn't subsequently rewrite their original ole extraction to varbinary bulk extraction, I'm left guessing what legacy method the original developers might have used to import and export embedded files to binary in their database. Their front-end is black box, so I can't see the source code they use.

I do know the data resides on a TEXT data field, and the column value size is typically something like 5637935 characters. Selecting the field value in SSMS yields something like ÿØÿáN&Exif .

Does any of this ring a bell for the group, and is there anyone who might be able to point me in the right direction that I might extract the files directly from the database?

Thanks in advance for your consideration.
Sean Lange
Sean Lange
One Orange Chip
One Orange Chip (25K reputation)One Orange Chip (25K reputation)One Orange Chip (25K reputation)One Orange Chip (25K reputation)One Orange Chip (25K reputation)One Orange Chip (25K reputation)One Orange Chip (25K reputation)One Orange Chip (25K reputation)

Group: General Forum Members
Points: 25797 Visits: 17509
I would guess that they used some sort of stream writer. You are going to have to write an application for this. There is no t-sql that can do this that I am aware of. Basically you read the contents into a byte array to "reassemble" the gibberish into a file. You could probably convert this into a CLR if you need to use it repeatedly.

If you are doing this in .NET the basic syntax would be very close to this. I am assuming that dt is a DataTable that has the text column as a column in the DataTable named "PDF".


System.IO.File.WriteAllBytes(@"C:\YourFolder\YourFile.pdf", dt.Rows[0]["PDF"] as byte[]);



_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Modens splitter.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs
Understanding and Using APPLY (Part 1)
Understanding and Using APPLY (Part 2)
Ken McKelvey
Ken McKelvey
SSCommitted
SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)

Group: General Forum Members
Points: 1657 Visits: 7879
Your key problem will be working out what the encoding is.

It is probably Base64. This site may help:

http://www.opinionatedgeek.com/dotnet/tools/base64decode/
Andre Ranieri
Andre Ranieri
Mr or Mrs. 500
Mr or Mrs. 500 (555 reputation)Mr or Mrs. 500 (555 reputation)Mr or Mrs. 500 (555 reputation)Mr or Mrs. 500 (555 reputation)Mr or Mrs. 500 (555 reputation)Mr or Mrs. 500 (555 reputation)Mr or Mrs. 500 (555 reputation)Mr or Mrs. 500 (555 reputation)

Group: General Forum Members
Points: 555 Visits: 379
Thanks for the pointers! I will try these both tonight.

Attempting to do this in CLR or .NET shouldn't be too difficult.

This is my first attempt to encode / decode to embedded objects in anything prior to SQL 2005. I guess I've gotten spoiled by varbinary(max) data type and BulkColumn FROM OPENROWSET, etc.

I'll let you know in the next day or so whether I have any success.

Thanks,

Andre Ranieri
Andre Ranieri
Andre Ranieri
Mr or Mrs. 500
Mr or Mrs. 500 (555 reputation)Mr or Mrs. 500 (555 reputation)Mr or Mrs. 500 (555 reputation)Mr or Mrs. 500 (555 reputation)Mr or Mrs. 500 (555 reputation)Mr or Mrs. 500 (555 reputation)Mr or Mrs. 500 (555 reputation)Mr or Mrs. 500 (555 reputation)

Group: General Forum Members
Points: 555 Visits: 379
The object is using the deprecated Text datatype, and reading the string directly from the database shows a text header like ÿØÿá6ªExif, which looks a lot like the EXIF headers if you read the contents of a jpeg file directly in notepad.

I can use a stream to copy the contents of one jpeg file to another. Can I use that same stream to write directly to the text field in the vendor's SQL database? If so, how would this be done?

Andre




string fromPath = @"c:\temp\test1.jpg";
string toPath = @"c:\temp\test2.jpg";

using (Stream source = File.OpenRead(fromPath))
using (Stream dest = File.Create(toPath))
{
byte[] buffer = new byte[1024];
int bytes;
while ((bytes = source.Read(buffer, 0, buffer.Length)) > 0)
{
dest.Write(buffer, 0, bytes);
}
}
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search