• wta306 (4/27/2011)


    Unless I'm mistaken, MD5 is only 128 bits (16bytes)...and yet the author of the original article stores the resulting values in a VARBINARY(100)? What am I missing?

    Yes, MD5 is 16 bytes. I suspect VARBINARY(100) was used because it will cater for using SHA1 (20 bytes) and, when SHA2 family of hash functions (SHA256 - 32 bytes, SHA512 - 64 bytes, SHA224 -28 bytes, and SHA384 - 48 bytes) are supported VARBINARY(100) will cater for all of them; and maybe it will even cater for SHA3. Presumably we'll know when NIST announces the winner of the SHA3 design contest next year; I'm pretty sure I've seen something about lengths for SHA3, so maybe we could know already, but I don't recall what so maybe not; I suspect VARBINARY(100) may turn out to be too short to cover all variants of SHA3 (but it will be years before SQL Server supports SHA3).

    To be certain why VARBINARY(100) was used in the article we'll have to hope for a response from the author; but I'm reasonably happy with my guess that it was intended to cope with known future growth in hash sizes (to 512 bits) and have a little bit to spare. After all, it has to be at least VARBINARY(20) not BINARY(16) to cope with all hashes supported by SQL Server 2008 unless one uses different data type depending on which hash is being used (which would make changing from one hash to another twice as hard) and there's no extra overhead in using VARBINART(64) or VARBINARY(100) so that it can cover SHA2 instead of restricting it to MDn and SHA1 by using VARBINARY(20). If I'd been writing something about hashing and wanted to cover future possibilities I might even have chosen VARBINARY(256) in the hope of catering for SHA4 when/if that happens!

    Tom