Detect / Determine data stored in a varbinary field

  • I have several tables a varbinary column in a database. They have names like CSB_BLOB or OBJECT_BLOB. Now I am having intermittent success with getting the data out.

    For example this query returns readable text from this data. GREAT!

    0x46726F6D3A20226465616E6E6167726.....etc --data as stored in the column

    SELECT CAST(CSB_BLOB AS VARCHAR(MAX)) AS 'Message' FROM OBJECT_BLOB

    However this column has the following query results.

    0x0001000000FFFFFFFF01000000000000000C....etc. --data as stored in column

    --this query returns empty result

    SELECT (CSB_BLOB AS VARCHAR(MAX)) AS 'Message' FROM CSB_STATUS_LOG

    --this query returns no change???

    SELECT CONVERT(VARCHAR(MAX), CONVERT(VARBINARY(MAX), CSB_BLOB, 2), 2) FROM CSB_STATUS_LOG

    0001000000FFFFFFFF01000000000000000C....etc

    Obviously there is a difference between the two but I am not educated enough to interpret this difference. What do I need to learn / read so I can look at the data in one of these BLOB columns and know how to convert it to something meaningful?

    If you can't really know just from looking what are all the different conversion I need to try? That in and of itself seems like an impossible question since just about anything can be converted to binary and stored so....

    As far as the different conversion I am not asking you to write my TSQL for me but instead just tell me the common conversions.

    Something like:

    1. Try to cast as varchar to see if it is text.

    2. Turn into a byte array and see if it is a jpg

    3. Turn into a byte array and see if it is a pdf

    4. Convert it to hex and then cast as varchar

    5. etc....

    Thank You

  • You can try converting the string to varchar or nvarchar to see if meaningful text comes out, but that assumes meaningful text was put in in the first place. If the column can be a text string, a jpg or a pdf, how to retrieve and interpret that data should be included when the data is stored down (and arguably not even in the same column).

    Beyond checking if it casts to a varchar or nvarchar, I would think you'd need to write a CLR or an application to check the contents of each varbinary value against the possible expected types by brute force.

    I think the real issue is to know what is getting store down in the first place.

    Executive Junior Cowboy Developer, Esq.[/url]

  • Quick thought, many file formats have a signature or a magical number at the start of the file which is unique for that format. There are quit few resources available such as File Signature Database, File Format Identification (forensics) and FileExtension.org[/url].

    If the first few bytes are all equal or higher than 32 / 0x20, then it is most likely a text file of some sort.

    😎

  • Looks like it's "5. etc....", numbers packed by some app somehow and only those app developer can tell you which way a BLOB should be decoded.

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply