SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


HASHBYTES: Is CHECKSUM really required?


HASHBYTES: Is CHECKSUM really required?

Author
Message
GarySha
GarySha
Forum Newbie
Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)

Group: General Forum Members
Points: 3 Visits: 10
I found the issue with Null value.If we concatenate n columns in table1 and compare them with same type of concatenated n columns in table2 . If any column in table 1 is null, Hashbyte result of whole concatenated columns would be Null and binary_checksum result will be some default value. So first we have to use coalesce function or some other way to treat null value for columns. Then only we can take benefit of these comparison functions.
Ryan.Polk
Ryan.Polk
SSC Journeyman
SSC Journeyman (91 reputation)SSC Journeyman (91 reputation)SSC Journeyman (91 reputation)SSC Journeyman (91 reputation)SSC Journeyman (91 reputation)SSC Journeyman (91 reputation)SSC Journeyman (91 reputation)SSC Journeyman (91 reputation)

Group: General Forum Members
Points: 91 Visits: 574
HASHBYTES is meant for a literal string (or binary) import of one column, so any potentially null fields would need to be wrapped in ISNULL(x,'') - and that's assuming you want a NULL string to be treated the exact same as an empty string.

CHECKSUM and BINARY_CHECKSUM have multi-column input, as opposed to a single string value.
richardd
richardd
Hall of Fame
Hall of Fame (3.1K reputation)Hall of Fame (3.1K reputation)Hall of Fame (3.1K reputation)Hall of Fame (3.1K reputation)Hall of Fame (3.1K reputation)Hall of Fame (3.1K reputation)Hall of Fame (3.1K reputation)Hall of Fame (3.1K reputation)

Group: General Forum Members
Points: 3142 Visits: 648
Your query using the CHECKSUM index is incorrect:

SELECT * 
FROM dbo.PostalCode
WHERE PostalCheckSum = CHECKSUM(@PostalArea, @PostalCity, @PostalState);



As previously mentioned, the value returned from CHECKSUM will not be unique for the inputs. The same input will always produce the same output, but different input is not guaranteed to produce different output.

Therefore, your query still needs to test the other columns:

SELECT * 
FROM dbo.PostalCode
WHERE PostalCheckSum = CHECKSUM(@PostalArea, @PostalCity, @PostalState)
AND PostalArea = @PostalArea
AND PostalCity = @PostalCity
AND PostalState = @PostalState;





Nakul Vachhrajani
Nakul Vachhrajani
SSCrazy
SSCrazy (2.2K reputation)SSCrazy (2.2K reputation)SSCrazy (2.2K reputation)SSCrazy (2.2K reputation)SSCrazy (2.2K reputation)SSCrazy (2.2K reputation)SSCrazy (2.2K reputation)SSCrazy (2.2K reputation)

Group: General Forum Members
Points: 2222 Visits: 2143
suharis (3/27/2013)
HI Nakul, we are using CHECKSUM currently, not sure if this is used by Default features of database, but always our etls are failing with CHECKSUM causing I/O errors


Are you trying to insert a CHECKSUM value? It should be a calculated field (it can be persisted by creating an index over the CHECKSUM).

This is a great forum question - can you post it with supporting details on the SSC forums?

Thanks & Regards,
Nakul Vachhrajani.
http://nakulvachhrajani.com
Be courteous. Drive responsibly.

Follow me on
Twitter: @sqltwins
Google Plus: +Nakul
adam.everett
adam.everett
SSC Rookie
SSC Rookie (39 reputation)SSC Rookie (39 reputation)SSC Rookie (39 reputation)SSC Rookie (39 reputation)SSC Rookie (39 reputation)SSC Rookie (39 reputation)SSC Rookie (39 reputation)SSC Rookie (39 reputation)

Group: General Forum Members
Points: 39 Visits: 118
Sorry this is a very bad example. Checksum like you say is not unique, creating very many duplicates. So as a solution is rather pointless, if the user gets incorrect results back from the query. Yes it saved index space, but the end query result is wrong!

Maybe I missed the point of the example?
message rooster
message rooster
Forum Newbie
Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)

Group: General Forum Members
Points: 3 Visits: 8
Nakal, this is a bad article. A SELECT using the checksum is not guaranteed to return the same results as a search using the actual values. If you use this in an app the results will be unreliable and potentially this can go undetected since it will mostly be correct. Mostly correct is worse than useless.

I appreciate you would have meant well, but you should withdraw the post or fix it. If there is a use for the checksum index that is not flawed then by all means put that in instead. SScommitted showed a non-flawed query, so that would be usable. Not sure if that query is better with the checksum than without it, but if your post gave a comparison of the two, it would then be a worthwhile article.
bayazidahmed
bayazidahmed
Old Hand
Old Hand (332 reputation)Old Hand (332 reputation)Old Hand (332 reputation)Old Hand (332 reputation)Old Hand (332 reputation)Old Hand (332 reputation)Old Hand (332 reputation)Old Hand (332 reputation)

Group: General Forum Members
Points: 332 Visits: 238
HASHBYTES comes at an additional storage cost, but not always as high as 8000 bytes as mentioned in your post.
Like danny.heijl points out it is only 16 bytes for MD5 or 20 bytes for SHA1.
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search