SQL Clone
SQLServerCentral is supported by Redgate
Log in  ::  Register  ::  Not logged in

SQL Server Encryption - Hashing Limitations

As I'm digging more into SQL Server 2008 encryption I'm learning some interesting things about how SQL Server handles encryption. Recently I was doing some research on the HASHBYTES function, which performs a one-way hash on data. This is useful in terms of validating secure passwords or determining is some data has changed. You do this by comparing the hash values, without necessarily needing to compare the actual data.

As I was reading about this function and researching the SHA-1 algorithm (one of a couple that can be used), I was surprised to find out that while the SHA-1 algorithm can handle 2^64 - 1 bytes of data, and SQL Server's text 2^32-1 bytes, this function is limited to 8,000 bytes of data.

I could understand back when we had 2k page sizes, or even the first evolution of 8k pages that not all functions would handle more text data, but isn't time in SQL Server 2008 that all functions that can operate on text data can handle more than 8k worth of data?

It seems silly these days, with more and more data being stored in SQL Server, and larger and larger sizes, wouldn't there be the need for functions to work with more than 8k of data?

Now HASHBYTES works on varbinary data, so character data gets converted (or needs to be), but there are definitely places where you might want to use this with binary data as well. Consider someone storing audio files in SQL Server, which definitely might exceed 8kb. Imagine that the end of a  recording is changed, something beyond the 8kb point. If two hashes were compared, they would be the same since this function truncates the data at 8kb.

What if someone wanted to use hashing to audit this data for changes?

It's time that SQL Server remove the 8k limits for data throughout the product. I'm hoping that SQL 11, or SQL 12 for sure, will do this. There are ways around this now (work in 8k chunks), but that's cumbersome and unnecessary. If we can do it with a UDF and a series of calls, can't the engine do it easier?

In the meantime, if you use hashing functions, be aware of this limitation.

The Voice of the DBA

Steve Jones is the editor of SQLServerCentral.com and visits a wide variety of data related topics in his daily editorial. Steve has spent years working as a DBA and general purpose Windows administrator, primarily working with SQL Server since it was ported from Sybase in 1990. You can follow Steve on Twitter at twitter.com/way0utwest


Posted by John Magnabosco on 18 May 2009

Steve, check out my blog entries on related topics: Encrypting Large Values (http://tinyurl.com/o7dyed) and Observing Hash Variance (http://tinyurl.com/r9vt8x)

Hopefully you'll find some of my other topics helpful as well in your journey into encryption.

Leave a Comment

Please register or log in to leave a comment.