Using the CheckSum Function - SQL School Video

  • Comments posted to this topic are about the item Using the CheckSum Function - SQL School Video

  • This is a good but dangerous technique. I implemented this in a production system and got burned by duplicate checksum values for different column values (hash collisions). The more data you put into the checksum, the less likely you are to get duplicates. Maybe.

    But good demo.

  • That's true. There's a slight chance of duplicate checksums. Another technique that I've used is HASHBYTES similar to this:

    SELECT HASHBYTES('MD5', Name + ISNULL(Color, 'Unknown')), * FROM Production.Product

    It takes longer to run, but it produces 16-byte values such as 0x313FB214C93591081E720123253B1398, which are far less likely to collide.
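
    One caveat when concatenating columns before hashing: without a separator, different column values can build the same input string ('ab' + 'c' versus 'a' + 'bc'). A minimal sketch of the same query with a delimiter, assuming the '|' character never appears in Name or Color (the delimiter and the RowHash alias are illustrative choices, not part of the original query):

    ```sql
    -- Sketch only: '|' is an assumed delimiter that must not occur in the data.
    -- ISNULL keeps a NULL Color from turning the whole concatenation into NULL.
    SELECT HASHBYTES('MD5', Name + '|' + ISNULL(Color, 'Unknown')) AS RowHash,
           ProductID,
           Name,
           Color
    FROM Production.Product;
    ```

    Note that on versions before SQL Server 2016, HASHBYTES accepts at most 8000 bytes of input, so very wide concatenations may not fit.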

  • Nice one...

  • Brian,

    That's an excellent tip!

    We're in the middle of a conversion project for HR & FIN and this will absolutely help.

    Great Job!

  • You should indeed use MD5!

    Using CHECKSUM easily results in duplicates, e.g.

    select checksum('eeeeeeeeeeeeeeee')

    select checksum('dddddddddddddddd')
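
    To compare the two results directly, the statements can be combined into one query. This is only a sketch; the HASHBYTES columns and all column aliases are additions for comparison and are not part of the original example:

    ```sql
    -- Put both CHECKSUM results next to the MD5 hashes of the same strings,
    -- so any duplicate CHECKSUM value is visible in a single row.
    SELECT CHECKSUM('eeeeeeeeeeeeeeee')         AS checksum_e,
           CHECKSUM('dddddddddddddddd')         AS checksum_d,
           HASHBYTES('MD5', 'eeeeeeeeeeeeeeee') AS md5_e,
           HASHBYTES('MD5', 'dddddddddddddddd') AS md5_d;
    ```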

  • Agreed, having worked on a system where someone decided to implement checksums to manage change control I know from bitter experience that it will bite you at some point. I think we found numbers that were related in some mathematical way (I forget exactly) would give the same value. But we also had examples of company name strings that gave the same value.

    If getting the integrity perfect is important I'd suggest not using this idea. For something that is supposed to be a rare occurrence it happened surprisingly frequently!

  • When implementing HASHBYTES or CHECKSUM, is it better to do these checks manually in the respective update stored procedure, or in a trigger? What is the best practice?

    My goal is to provide auditing (stored in a separate audit table) in SQL Server 2005 when a specific piece of data, e.g. the username or the user's password, is modified when the entire user row is updated.
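
    For reference, the trigger-based variant might look roughly like the sketch below. Everything in it is hypothetical: the tables dbo.Users and dbo.UserAudit, their columns, and the trigger name are made up for illustration, and the audited columns are assumed to be NOT NULL. Because only two columns are audited, it compares them directly rather than through a checksum, which avoids the collision problem discussed above:

    ```sql
    -- Hypothetical schema: dbo.Users(UserID, UserName, PasswordHash)
    -- and dbo.UserAudit(UserID, OldUserName, NewUserName, PasswordChanged, ChangedAt).
    CREATE TRIGGER dbo.trg_Users_Audit
    ON dbo.Users
    AFTER UPDATE
    AS
    BEGIN
        SET NOCOUNT ON;

        -- deleted holds the old row values, inserted holds the new ones.
        INSERT INTO dbo.UserAudit (UserID, OldUserName, NewUserName, PasswordChanged, ChangedAt)
        SELECT d.UserID,
               d.UserName,
               i.UserName,
               CASE WHEN d.PasswordHash <> i.PasswordHash THEN 1 ELSE 0 END,
               GETDATE()
        FROM deleted AS d
        JOIN inserted AS i
            ON i.UserID = d.UserID
        -- Audit only rows where one of the tracked columns actually changed.
        WHERE d.UserName <> i.UserName
           OR d.PasswordHash <> i.PasswordHash;
    END;
    ```

    A trigger like this fires for every update path, including ad hoc statements, whereas a check inside the stored procedure only covers updates made through that procedure.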

  • First of all kudos to Brian for his videos. I absolutely love them and have learned so much. Secondly I agree with the others regarding the dangers of using CHECKSUM against string values due to the non-uniqueness of the result. I think it's fairly safe to use across several numeric/date columns though. I have never used HASHBYTES before, so this is a great tip as well.

    Thank you!
