Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase «««678910»»

HASHBYTES can help quickly load a Data Warehouse Expand / Collapse
Author
Message
Posted Wednesday, February 23, 2011 6:08 AM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Monday, June 18, 2012 4:11 AM
Points: 27, Visits: 219
Hey guys,

Check out this website, will probably solve your issues with Hash Bytes within SSIS:

http://ssismhash.codeplex.com/

God Bless.
Post #1068177
Posted Monday, April 25, 2011 3:52 PM
SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Thursday, July 17, 2014 10:36 AM
Points: 416, Visits: 523
How do you determine what size to make your VARBINARY field to hold the results of the HASHBYTES function?


Is it simply a matter of setting it to be equal to the largest HashByte function input value?

DECLARE @vHashResults AS VARBINARY(40)
DECLARE @vHashInput01 AS CHAR(20)
DECLARE @vHashInput02 AS CHAR(20)

SET @vHashInput01 = 'thisisa20bytechar123'
SET @vHashInput02 = 'thisisa21bytechar1234'

SET @vHashResults = HashBytes('SHA1', @vHashInput01 + @vHashInput02)


Post #1098333
Posted Tuesday, April 26, 2011 12:08 PM


SSCrazy Eights

SSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy Eights

Group: General Forum Members
Last Login: Today @ 2:23 PM
Points: 8,562, Visits: 9,059
wta306 (4/25/2011)
How do you determine what size to make your VARBINARY field to hold the results of the HASHBYTES function?


Is it simply a matter of setting it to be equal to the largest HashByte function input value?

DECLARE @vHashResults AS VARBINARY(40)
DECLARE @vHashInput01 AS CHAR(20)
DECLARE @vHashInput02 AS CHAR(20)

SET @vHashInput01 = 'thisisa20bytechar123'
SET @vHashInput02 = 'thisisa21bytechar1234'

SET @vHashResults = HashBytes('SHA1', @vHashInput01 + @vHashInput02)


I would be somewhat horrified if HashBytes('SHA1', <anything>) delived a result with any length other than 160 bits (20 bytes)! That's what MS thinks it delivers too, according to BoL.


Tom
Post #1098891
Posted Wednesday, April 27, 2011 6:20 AM
SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Thursday, July 17, 2014 10:36 AM
Points: 416, Visits: 523
Tom.Thomson (4/26/2011)
wta306 (4/25/2011)
How do you determine what size to make your VARBINARY field to hold the results of the HASHBYTES function?


Is it simply a matter of setting it to be equal to the largest HashByte function input value?

DECLARE @vHashResults AS VARBINARY(40)
DECLARE @vHashInput01 AS CHAR(20)
DECLARE @vHashInput02 AS CHAR(20)

SET @vHashInput01 = 'thisisa20bytechar123'
SET @vHashInput02 = 'thisisa21bytechar1234'

SET @vHashResults = HashBytes('SHA1', @vHashInput01 + @vHashInput02)


I would be somewhat horrified if HashBytes('SHA1', <anything>) delived a result with any length other than 160 bits (20 bytes)! That's what MS thinks it delivers too, according to BoL.


Tom - thank you for the response.

Unless I'm mistaken, MD5 is only 128 bits (16bytes)...and yet the author of the original article stores the resulting values in a VARBINARY(100)? What am I missing?
Post #1099358
Posted Wednesday, April 27, 2011 7:59 AM


SSCrazy Eights

SSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy Eights

Group: General Forum Members
Last Login: Today @ 2:23 PM
Points: 8,562, Visits: 9,059
wta306 (4/27/2011)
Unless I'm mistaken, MD5 is only 128 bits (16bytes)...and yet the author of the original article stores the resulting values in a VARBINARY(100)? What am I missing?

Yes, MD5 is 16 bytes. I suspect VARBINARY(100) was used because it will cater for using SHA1 (20 bytes) and, when SHA2 family of hash functions (SHA256 - 32 bytes, SHA512 - 64 bytes, SHA224 -28 bytes, and SHA384 - 48 bytes) are supported VARBINARY(100) will cater for all of them; and maybe it will even cater for SHA3. Presumably we'll know when NIST announces the winner of the SHA3 design contest next year; I'm pretty sure I've seen something about lengths for SHA3, so maybe we could know already, but I don't recall what so maybe not; I suspect VARBINARY(100) may turn out to be too short to cover all variants of SHA3 (but it will be years before SQL Server supports SHA3).

To be certain why VARBINARY(100) was used in the article we'll have to hope for a response from the author; but I'm reasonably happy with my guess that it was intended to cope with known future growth in hash sizes (to 512 bits) and have a little bit to spare. After all, it has to be at least VARBINARY(20) not BINARY(16) to cope with all hashes supported by SQL Server 2008 unless one uses different data type depending on which hash is being used (which would make changing from one hash to another twice as hard) and there's no extra overhead in using VARBINART(64) or VARBINARY(100) so that it can cover SHA2 instead of restricting it to MDn and SHA1 by using VARBINARY(20). If I'd been writing something about hashing and wanted to cover future possibilities I might even have chosen VARBINARY(256) in the hope of catering for SHA4 when/if that happens!


Tom
Post #1099464
Posted Wednesday, April 27, 2011 8:45 AM
SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Thursday, July 17, 2014 10:36 AM
Points: 416, Visits: 523
Tom.Thomson (4/27/2011)
wta306 (4/27/2011)
Unless I'm mistaken, MD5 is only 128 bits (16bytes)...and yet the author of the original article stores the resulting values in a VARBINARY(100)? What am I missing?

Yes, MD5 is 16 bytes. I suspect VARBINARY(100) was used because it will cater for using SHA1 (20 bytes) and, when SHA2 family of hash functions (SHA256 - 32 bytes, SHA512 - 64 bytes, SHA224 -28 bytes, and SHA384 - 48 bytes) are supported VARBINARY(100) will cater for all of them; and maybe it will even cater for SHA3. Presumably we'll know when NIST announces the winner of the SHA3 design contest next year; I'm pretty sure I've seen something about lengths for SHA3, so maybe we could know already, but I don't recall what so maybe not; I suspect VARBINARY(100) may turn out to be too short to cover all variants of SHA3 (but it will be years before SQL Server supports SHA3).

To be certain why VARBINARY(100) was used in the article we'll have to hope for a response from the author; but I'm reasonably happy with my guess that it was intended to cope with known future growth in hash sizes (to 512 bits) and have a little bit to spare. After all, it has to be at least VARBINARY(20) not BINARY(16) to cope with all hashes supported by SQL Server 2008 unless one uses different data type depending on which hash is being used (which would make changing from one hash to another twice as hard) and there's no extra overhead in using VARBINART(64) or VARBINARY(100) so that it can cover SHA2 instead of restricting it to MDn and SHA1 by using VARBINARY(20). If I'd been writing something about hashing and wanted to cover future possibilities I might even have chosen VARBINARY(256) in the hope of catering for SHA4 when/if that happens!


wow, ok - that makes a ton of sense. thank you!!
Post #1099527
Posted Wednesday, April 27, 2011 11:04 PM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Monday, July 21, 2014 4:44 AM
Points: 44, Visits: 237
Hi,

We use SHA1 and store the result as Varbinary(MAX).
Do you see any problems with this ?

Thanks
Post #1099970
Posted Thursday, April 28, 2011 5:21 AM


SSCrazy Eights

SSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy Eights

Group: General Forum Members
Last Login: Today @ 2:23 PM
Points: 8,562, Visits: 9,059
Cyclone (4/27/2011)
Hi,

We use SHA1 and store the result as Varbinary(MAX).
Do you see any problems with this ?

Thanks

Yes, there is a problem: a varbinary(max) column can't be a key column in an index, so if you want to use your hash in an index you shouldn't use varbinary(max). That's the only problem I can think of.


Tom
Post #1100130
Posted Saturday, October 8, 2011 9:03 PM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Friday, July 18, 2014 2:47 PM
Points: 40, Visits: 451
Has anyone come across cases where SHA1 has produced output longer than 20 bytes, because I think I have. We are using HASHBYTES to detect changes in our data warehouse, and we were getting type 2 changes, even though all the columns values were exactly the same between the most current and previous rows; even the hashbyte value looked the same!

But, upon looking closer, the HASHBYTE function was returning this: 0xEE0816119692A1C4DCC2045EA1E37CD47EFF4900
But, the value in our varbinary(20) column in the table was storing this: 0xEE0816119692A1C4DCC2045EA1E37CD47EFF49

Notice the two zeros at the end of the function's output. For some reason, when SQL writes the output from the function to the table it truncates those last two "characters" at the end!

We have no idea why it does that, but when we increase the varbinary column in the table from 20 to 40, SQL doesn't truncate the output anymore, and all is well.

Anyway, if anyone has any insight on why SQL seems to be truncating the output from the function as it writes to the varbinary(20) column, please let me know, I'm curious!

L' Eomot Inversé (4/26/2011)
wta306 (4/25/2011)
How do you determine what size to make your VARBINARY field to hold the results of the HASHBYTES function?


Is it simply a matter of setting it to be equal to the largest HashByte function input value?

DECLARE @vHashResults AS VARBINARY(40)
DECLARE @vHashInput01 AS CHAR(20)
DECLARE @vHashInput02 AS CHAR(20)

SET @vHashInput01 = 'thisisa20bytechar123'
SET @vHashInput02 = 'thisisa21bytechar1234'

SET @vHashResults = HashBytes('SHA1', @vHashInput01 + @vHashInput02)


I would be somewhat horrified if HashBytes('SHA1', <anything>) delived a result with any length other than 160 bits (20 bytes)! That's what MS thinks it delivers too, according to BoL.
Post #1187593
Posted Sunday, October 9, 2011 9:38 AM


SSCrazy Eights

SSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy Eights

Group: General Forum Members
Last Login: Today @ 2:23 PM
Points: 8,562, Visits: 9,059
Langston Montgomery (10/8/2011)
Has anyone come across cases where SHA1 has produced output longer than 20 bytes, because I think I have. We are using HASHBYTES to detect changes in our data warehouse, and we were getting type 2 changes, even though all the columns values were exactly the same between the most current and previous rows; even the hashbyte value looked the same!

But, upon looking closer, the HASHBYTE function was returning this: 0xEE0816119692A1C4DCC2045EA1E37CD47EFF4900
But, the value in our varbinary(20) column in the table was storing this: 0xEE0816119692A1C4DCC2045EA1E37CD47EFF49
I don't think "longer that 20 bytes" is relevant, as the number ending 00 is 20 bytes (the one without 00 is only 19).

Notice the two zeros at the end of the function's output. For some reason, when SQL writes the output from the function to the table it truncates those last two "characters" at the end!

We have no idea why it does that, but when we increase the varbinary column in the table from 20 to 40, SQL doesn't truncate the output anymore, and all is well.

Anyway, if anyone has any insight on why SQL seems to be truncating the output from the function as it writes to the varbinary(20) column, please let me know, I'm curious!

I tried a few things to see if I could reproduce this behaviour, using hashbytes to generate values for local variables, for columns in table variable, for columns in temp tables, and for columns in premanent tables, but got no truncation ever when using varchar(20). Maybe I'm on a different version - I'm using SQL 2008 R2.


Tom
Post #1187641
« Prev Topic | Next Topic »

Add to briefcase «««678910»»

Permissions Expand / Collapse