Click here to monitor SSC
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


HASHBYTES: Is CHECKSUM really required?


HASHBYTES: Is CHECKSUM really required?

Author
Message
Nakul Vachhrajani
Nakul Vachhrajani
SSCommitted
SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)

Group: General Forum Members
Points: 1824 Visits: 2127
Comments posted to this topic are about the item HASHBYTES: Is CHECKSUM really required?

Thanks & Regards,
Nakul Vachhrajani.
http://nakulvachhrajani.com
Be courteous. Drive responsibly.

Follow me on
Twitter: @sqltwins
Google Plus: +Nakul
Alex Fekken
Alex Fekken
SSC Veteran
SSC Veteran (215 reputation)SSC Veteran (215 reputation)SSC Veteran (215 reputation)SSC Veteran (215 reputation)SSC Veteran (215 reputation)SSC Veteran (215 reputation)SSC Veteran (215 reputation)SSC Veteran (215 reputation)

Group: General Forum Members
Points: 215 Visits: 460
Thanks for the article Nakul.

I have a question about HASHBYTES (and hashing in general) to which I have found it impossible so far to find a definite answer. Perhaps you know: you list HASHBYTES as providing change detection, but is that really true?

In a lot of places you see claims that "any change" will affect the hash value but that is not true: it is easy to see that for any (finite) hash algorithm there are always two files (or datablobs) that are the same size and differ by only two bits and that will give the same hash value. But it is not so obvious (at least to me) if there can be two files/datablobs that differ by only one bit and that will have the same hash value. Of course I realise that from a practical point of view it is extremely unlikely that any two files/datablobs will have the same hash value but I am interested in the theoretical question: will a single bit change always change the hash value?

Thanks,
Alex
the_huge_bear
the_huge_bear
Forum Newbie
Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)

Group: General Forum Members
Points: 3 Visits: 44
Hi guys,

Thanks a lot for your publication, just few words to advise.

I use this approch, and I meet many problem with the checksum process.

So if you have some Null value in the key the checksum failed or return an "strange" value.
So take care to controle all fields have a value and best way be not null, and to reduce failure of cheksum process, i add a "|" between each fields, add an IsNull function 'In case Of".

Regards
sqlchan
sqlchan
Forum Newbie
Forum Newbie (6 reputation)Forum Newbie (6 reputation)Forum Newbie (6 reputation)Forum Newbie (6 reputation)Forum Newbie (6 reputation)Forum Newbie (6 reputation)Forum Newbie (6 reputation)Forum Newbie (6 reputation)

Group: General Forum Members
Points: 6 Visits: 22
Is CHECKSUM guarenteed to be unique? Can different combinations of data give the same CHECKSUM value?
danny.heijl
danny.heijl
Forum Newbie
Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)

Group: General Forum Members
Points: 3 Visits: 47
A checksum is 4 bytes, with very little chance for uniqueness, and is very easily tampered with.

Hashbytes is a 16 bytes (for MD5) or 20 bytes (for SHA1) hash value calculated over a maximum of 8000 input bytes, with a pretty good guarantee for uniqueness and in the case of SHA1 also very difficult to tamper with..

So HashBytes (SHA1) is the right choice in my opinion.
gary.strange-sqlconsumer
gary.strange-sqlconsumer
SSC Journeyman
SSC Journeyman (75 reputation)SSC Journeyman (75 reputation)SSC Journeyman (75 reputation)SSC Journeyman (75 reputation)SSC Journeyman (75 reputation)SSC Journeyman (75 reputation)SSC Journeyman (75 reputation)SSC Journeyman (75 reputation)

Group: General Forum Members
Points: 75 Visits: 610
Good Article Nakul,

But it would have benefited immensely from a paragraph on fault tolerance.

Microsoft's msdn page refers to it explicitly...
http://msdn.microsoft.com/en-gb/library/ms189788.aspx

As table rows increase the chance of the "Birthday Paradox" increases and as the strategy is to reduce index size and therefore search times. I would imagine you would be targeting a table with a large number of rows. So perhaps a counter productive strategy.

Agreeably tables with a relatively low number of rows but a very wide search criteria may benefit from the CHECKSUM strategy.
gary.strange-sqlconsumer
gary.strange-sqlconsumer
SSC Journeyman
SSC Journeyman (75 reputation)SSC Journeyman (75 reputation)SSC Journeyman (75 reputation)SSC Journeyman (75 reputation)SSC Journeyman (75 reputation)SSC Journeyman (75 reputation)SSC Journeyman (75 reputation)SSC Journeyman (75 reputation)

Group: General Forum Members
Points: 75 Visits: 610
Good Article Nakul,

But it would have benefited immensely from a paragraph on fault tolerance.

Microsoft's msdn page refers to it explicitly...
http://msdn.microsoft.com/en-gb/library/ms189788.aspx

As table rows increase the chance of the "Birthday Paradox" increases and as the strategy is to reduce index size and therefore search times. I would imagine you would be targeting a table with a large number of rows. So perhaps a counter productive strategy.

Agreeably tables with a relatively low number of rows but a very wide search criteria may benefit from the CHECKSUM strategy.
gary.strange-sqlconsumer
gary.strange-sqlconsumer
SSC Journeyman
SSC Journeyman (75 reputation)SSC Journeyman (75 reputation)SSC Journeyman (75 reputation)SSC Journeyman (75 reputation)SSC Journeyman (75 reputation)SSC Journeyman (75 reputation)SSC Journeyman (75 reputation)SSC Journeyman (75 reputation)

Group: General Forum Members
Points: 75 Visits: 610
Good Article Nakul,

But it would have benefited immensely from a paragraph on fault tolerance.

Microsoft's msdn page refers to it explicitly...
http://msdn.microsoft.com/en-gb/library/ms189788.aspx

As table rows increase the chance of the "Birthday Paradox" increases and as the strategy is to reduce index size and therefore search times. I would imagine you would be targeting a table with a large number of rows. So perhaps a counter productive strategy.

Agreeably tables with a relatively low number of rows but a very wide search criteria may benefit from the CHECKSUM strategy.
Nakul Vachhrajani
Nakul Vachhrajani
SSCommitted
SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)

Group: General Forum Members
Points: 1824 Visits: 2127
Thank-you, all for your time and valuable feedback.

Yes, CHECKSUM fails to detect symmetric changes and has therefore been listed only as a "basic" change detection mechanism. The "ideal" one would be HASHBYTES, but it comes at an additional cost - storage.

As mentioned in the article, I would request the kind reader to please go through my in-depth study on the various change and tamper detection mechanisms available where I attempt to evaluate each of the pros & cons of the various methods (Part 01 and Part 02) .

Thanks & Regards,
Nakul Vachhrajani.
http://nakulvachhrajani.com
Be courteous. Drive responsibly.

Follow me on
Twitter: @sqltwins
Google Plus: +Nakul
suharis
suharis
Forum Newbie
Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)

Group: General Forum Members
Points: 3 Visits: 3
HI Nakul, we are using CHECKSUM currently, not sure if this is used by Default features of database, but always our etls are failing with CHECKSUM causing I/O errors
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search