SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


HASHBYTES: Is CHECKSUM really required?


HASHBYTES: Is CHECKSUM really required?

Author
Message
Nakul Vachhrajani
Nakul Vachhrajani
Hall of Fame
Hall of Fame (3.3K reputation)Hall of Fame (3.3K reputation)Hall of Fame (3.3K reputation)Hall of Fame (3.3K reputation)Hall of Fame (3.3K reputation)Hall of Fame (3.3K reputation)Hall of Fame (3.3K reputation)Hall of Fame (3.3K reputation)

Group: General Forum Members
Points: 3325 Visits: 2149
Comments posted to this topic are about the item HASHBYTES: Is CHECKSUM really required?

Thanks & Regards,
Nakul Vachhrajani.
http://nakulvachhrajani.com
Be courteous. Drive responsibly.

Follow me on
Twitter: @sqltwins
Google Plus: +Nakul
Alex Fekken
Alex Fekken
Old Hand
Old Hand (385 reputation)Old Hand (385 reputation)Old Hand (385 reputation)Old Hand (385 reputation)Old Hand (385 reputation)Old Hand (385 reputation)Old Hand (385 reputation)Old Hand (385 reputation)

Group: General Forum Members
Points: 385 Visits: 460
Thanks for the article Nakul.

I have a question about HASHBYTES (and hashing in general) to which I have found it impossible so far to find a definite answer. Perhaps you know: you list HASHBYTES as providing change detection, but is that really true?

In a lot of places you see claims that "any change" will affect the hash value but that is not true: it is easy to see that for any (finite) hash algorithm there are always two files (or datablobs) that are the same size and differ by only two bits and that will give the same hash value. But it is not so obvious (at least to me) if there can be two files/datablobs that differ by only one bit and that will have the same hash value. Of course I realise that from a practical point of view it is extremely unlikely that any two files/datablobs will have the same hash value but I am interested in the theoretical question: will a single bit change always change the hash value?

Thanks,
Alex
the_huge_bear
the_huge_bear
SSC Rookie
SSC Rookie (31 reputation)SSC Rookie (31 reputation)SSC Rookie (31 reputation)SSC Rookie (31 reputation)SSC Rookie (31 reputation)SSC Rookie (31 reputation)SSC Rookie (31 reputation)SSC Rookie (31 reputation)

Group: General Forum Members
Points: 31 Visits: 44
Hi guys,

Thanks a lot for your publication, just few words to advise.

I use this approch, and I meet many problem with the checksum process.

So if you have some Null value in the key the checksum failed or return an "strange" value.
So take care to controle all fields have a value and best way be not null, and to reduce failure of cheksum process, i add a "|" between each fields, add an IsNull function 'In case Of".

Regards
sqlchan
sqlchan
Grasshopper
Grasshopper (20 reputation)Grasshopper (20 reputation)Grasshopper (20 reputation)Grasshopper (20 reputation)Grasshopper (20 reputation)Grasshopper (20 reputation)Grasshopper (20 reputation)Grasshopper (20 reputation)

Group: General Forum Members
Points: 20 Visits: 22
Is CHECKSUM guarenteed to be unique? Can different combinations of data give the same CHECKSUM value?
danny.heijl
danny.heijl
Grasshopper
Grasshopper (15 reputation)Grasshopper (15 reputation)Grasshopper (15 reputation)Grasshopper (15 reputation)Grasshopper (15 reputation)Grasshopper (15 reputation)Grasshopper (15 reputation)Grasshopper (15 reputation)

Group: General Forum Members
Points: 15 Visits: 47
A checksum is 4 bytes, with very little chance for uniqueness, and is very easily tampered with.

Hashbytes is a 16 bytes (for MD5) or 20 bytes (for SHA1) hash value calculated over a maximum of 8000 input bytes, with a pretty good guarantee for uniqueness and in the case of SHA1 also very difficult to tamper with..

So HashBytes (SHA1) is the right choice in my opinion.
gary.strange-sqlconsumer
gary.strange-sqlconsumer
Old Hand
Old Hand (398 reputation)Old Hand (398 reputation)Old Hand (398 reputation)Old Hand (398 reputation)Old Hand (398 reputation)Old Hand (398 reputation)Old Hand (398 reputation)Old Hand (398 reputation)

Group: General Forum Members
Points: 398 Visits: 627
Good Article Nakul,

But it would have benefited immensely from a paragraph on fault tolerance.

Microsoft's msdn page refers to it explicitly...
http://msdn.microsoft.com/en-gb/library/ms189788.aspx

As table rows increase the chance of the "Birthday Paradox" increases and as the strategy is to reduce index size and therefore search times. I would imagine you would be targeting a table with a large number of rows. So perhaps a counter productive strategy.

Agreeably tables with a relatively low number of rows but a very wide search criteria may benefit from the CHECKSUM strategy.
gary.strange-sqlconsumer
gary.strange-sqlconsumer
Old Hand
Old Hand (398 reputation)Old Hand (398 reputation)Old Hand (398 reputation)Old Hand (398 reputation)Old Hand (398 reputation)Old Hand (398 reputation)Old Hand (398 reputation)Old Hand (398 reputation)

Group: General Forum Members
Points: 398 Visits: 627
Good Article Nakul,

But it would have benefited immensely from a paragraph on fault tolerance.

Microsoft's msdn page refers to it explicitly...
http://msdn.microsoft.com/en-gb/library/ms189788.aspx

As table rows increase the chance of the "Birthday Paradox" increases and as the strategy is to reduce index size and therefore search times. I would imagine you would be targeting a table with a large number of rows. So perhaps a counter productive strategy.

Agreeably tables with a relatively low number of rows but a very wide search criteria may benefit from the CHECKSUM strategy.
gary.strange-sqlconsumer
gary.strange-sqlconsumer
Old Hand
Old Hand (398 reputation)Old Hand (398 reputation)Old Hand (398 reputation)Old Hand (398 reputation)Old Hand (398 reputation)Old Hand (398 reputation)Old Hand (398 reputation)Old Hand (398 reputation)

Group: General Forum Members
Points: 398 Visits: 627
Good Article Nakul,

But it would have benefited immensely from a paragraph on fault tolerance.

Microsoft's msdn page refers to it explicitly...
http://msdn.microsoft.com/en-gb/library/ms189788.aspx

As table rows increase the chance of the "Birthday Paradox" increases and as the strategy is to reduce index size and therefore search times. I would imagine you would be targeting a table with a large number of rows. So perhaps a counter productive strategy.

Agreeably tables with a relatively low number of rows but a very wide search criteria may benefit from the CHECKSUM strategy.
Nakul Vachhrajani
Nakul Vachhrajani
Hall of Fame
Hall of Fame (3.3K reputation)Hall of Fame (3.3K reputation)Hall of Fame (3.3K reputation)Hall of Fame (3.3K reputation)Hall of Fame (3.3K reputation)Hall of Fame (3.3K reputation)Hall of Fame (3.3K reputation)Hall of Fame (3.3K reputation)

Group: General Forum Members
Points: 3325 Visits: 2149
Thank-you, all for your time and valuable feedback.

Yes, CHECKSUM fails to detect symmetric changes and has therefore been listed only as a "basic" change detection mechanism. The "ideal" one would be HASHBYTES, but it comes at an additional cost - storage.

As mentioned in the article, I would request the kind reader to please go through my in-depth study on the various change and tamper detection mechanisms available where I attempt to evaluate each of the pros & cons of the various methods (Part 01 and Part 02) .

Thanks & Regards,
Nakul Vachhrajani.
http://nakulvachhrajani.com
Be courteous. Drive responsibly.

Follow me on
Twitter: @sqltwins
Google Plus: +Nakul
suharis
suharis
Grasshopper
Grasshopper (17 reputation)Grasshopper (17 reputation)Grasshopper (17 reputation)Grasshopper (17 reputation)Grasshopper (17 reputation)Grasshopper (17 reputation)Grasshopper (17 reputation)Grasshopper (17 reputation)

Group: General Forum Members
Points: 17 Visits: 3
HI Nakul, we are using CHECKSUM currently, not sure if this is used by Default features of database, but always our etls are failing with CHECKSUM causing I/O errors
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search