Click here to monitor SSC
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


checksums and unicode data


checksums and unicode data

Author
Message
RBarryYoung
RBarryYoung
SSChampion
SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)

Group: General Forum Members
Points: 10320 Visits: 9517
Jeff Moden (8/9/2009)
Paul White (8/8/2009)
HashBytes (perhaps including the PK as a salt) is a good choice if the task is to detect changes.
CHECKSUM is quick and efficient for creating hash indexes - which are awesome if you need to search long strings (for example).


Even then, it can and will give an incorrect answer if you rely only on it. I'll admit that CHECKSUM will help narrow things down on long strings, though.

I'm, going to have to disagree with this one, Jeff, at least as far as HashBytes using the SHA1 algorithm is concerned. Assuming you did a Trillion HashByte comparisons of different data a day (and no one on earth is doing that many), it would still take about 6000 to 12000 Years before you were likely to get an accidental match. Those are odds that I can live with. :-)

-- RBarryYoung, (302)375-0451 blog: MovingSQL.com, Twitter: @RBarryYoung
Proactive Performance Solutions, Inc.
"Performance is our middle name."
Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (54K reputation)SSC Guru (54K reputation)SSC Guru (54K reputation)SSC Guru (54K reputation)SSC Guru (54K reputation)SSC Guru (54K reputation)SSC Guru (54K reputation)SSC Guru (54K reputation)

Group: General Forum Members
Points: 54257 Visits: 40390
Ummmm.... does SQL Server's CHECKSUM use the SHA1 algorithm because I was talking about that CHECKSUM. Obviously that doesn't take 6000 to 12000 years to happen because I've seen two of these same types of posts in the last 3 months. ;-)

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
RBarryYoung
RBarryYoung
SSChampion
SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)

Group: General Forum Members
Points: 10320 Visits: 9517
Jeff Moden (8/10/2009)
Ummmm.... does SQL Server's CHECKSUM use the SHA1 algorithm because I was talking about that CHECKSUM. Obviously that doesn't take 6000 to 12000 years to happen because I've seen two of these same types of posts in the last 3 months. ;-)

Oh no, sorry, CHECKSUM produces duplicates if you so much as swap the order of two of the characters. So it stinks for this purpose, and we are in agreement on that. :-D

-- RBarryYoung, (302)375-0451 blog: MovingSQL.com, Twitter: @RBarryYoung
Proactive Performance Solutions, Inc.
"Performance is our middle name."
RBarryYoung
RBarryYoung
SSChampion
SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)

Group: General Forum Members
Points: 10320 Visits: 9517
To clarify: HashBytes w/ SHA1 is nigh-unbreakable.

-- RBarryYoung, (302)375-0451 blog: MovingSQL.com, Twitter: @RBarryYoung
Proactive Performance Solutions, Inc.
"Performance is our middle name."
Paul White
Paul White
SSChampion
SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)

Group: General Forum Members
Points: 11448 Visits: 11354
RBarryYoung (8/10/2009)
To clarify: HashBytes w/ SHA1 is nigh-unbreakable.

Hey Barry,

On the other hand, it is relatively slow, only works on strings, and returns varbinary(8000).
If the task is to detect changes in a row of data, would you use HashBytes (SHA1) alone?

Paul



Paul White
SQLPerformance.com
SQLblog.com
@SQL_Kiwi
Lowell
Lowell
SSCoach
SSCoach (18K reputation)SSCoach (18K reputation)SSCoach (18K reputation)SSCoach (18K reputation)SSCoach (18K reputation)SSCoach (18K reputation)SSCoach (18K reputation)SSCoach (18K reputation)

Group: General Forum Members
Points: 18636 Visits: 39436
just looking for confirmation...i used my google-fu and found info and examples for using hashbytes with SHA1, ie

select HASHBYTES('SHA1','My Plain Text')
--results
0x6D99DDF6FE7A32547B6766E0BF88B1F50835F0FF



everything i read says that this is a one way operation, so you can use it to generate a unique value like checksum, but you cannot unhash this back to it's original value, right?

so this is not encryption, but rather a tool to generate a unique identifier, correct?

Lowell

--
help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (54K reputation)SSC Guru (54K reputation)SSC Guru (54K reputation)SSC Guru (54K reputation)SSC Guru (54K reputation)SSC Guru (54K reputation)SSC Guru (54K reputation)SSC Guru (54K reputation)

Group: General Forum Members
Points: 54257 Visits: 40390
Paul White (8/10/2009)
RBarryYoung (8/10/2009)
To clarify: HashBytes w/ SHA1 is nigh-unbreakable.

Hey Barry,

On the other hand, it is relatively slow, only works on strings, and returns varbinary(8000).
If the task is to detect changes in a row of data, would you use HashBytes (SHA1) alone?

Paul


In Lowell's example, it also turned a 13 byte string into a 20 byte Varbinary so it's expensive, too.

But, I believe that Barry's point was simply that it is unbreakable... not like CHECKSUM at all. :-)

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Paul White
Paul White
SSChampion
SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)

Group: General Forum Members
Points: 11448 Visits: 11354
Jeff Moden (8/10/2009)
In Lowell's example, it also turned a 13 byte string into a 20 byte Varbinary so it's expensive, too. But, I believe that Barry's point was simply that it is unbreakable... not like CHECKSUM at all. :-)

So...SHA1 always returns 160 bits (20 bytes) regardless of the input. HashBytes also appears to be limited to 8000 bytes of input - 8000 ANSI characters or 4000 Unicode. Finally, it doesn't appear unbreakable either: Wikipedia Link.

I mentioned HashBytes way back - now I'm just pointing out some of the reasons that I have yet to use it in a real production system.

Paul



Paul White
SQLPerformance.com
SQLblog.com
@SQL_Kiwi
RBarryYoung
RBarryYoung
SSChampion
SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)

Group: General Forum Members
Points: 10320 Visits: 9517
Paul White (8/10/2009)
RBarryYoung (8/10/2009)
To clarify: HashBytes w/ SHA1 is nigh-unbreakable.

Hey Barry,

On the other hand, it is relatively slow, only works on strings, and returns varbinary(8000).
If the task is to detect changes in a row of data, would you use HashBytes (SHA1) alone?

Paul

Despite the datatype, I think that it actually only returns 19 bytes. And since VARBINARY is a string and *everything* converts to varbinary very easily, thats not that big a problem (nulls are a bigger problem). So yeah, I *might* use it in situations were I was doing remote comparisons through linked servers and just recording the hash every day for later comparison. The problem with the field by field comparison is that you have to have the entire previous record around to do it.

Though I think that I usually used MD5 in the past.

-- RBarryYoung, (302)375-0451 blog: MovingSQL.com, Twitter: @RBarryYoung
Proactive Performance Solutions, Inc.
"Performance is our middle name."
RBarryYoung
RBarryYoung
SSChampion
SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)

Group: General Forum Members
Points: 10320 Visits: 9517
Paul White (8/10/2009)
Jeff Moden (8/10/2009)
In Lowell's example, it also turned a 13 byte string into a 20 byte Varbinary so it's expensive, too. But, I believe that Barry's point was simply that it is unbreakable... not like CHECKSUM at all. :-)

So...SHA1 always returns 160 bits (20 bytes) regardless of the input. HashBytes also appears to be limited to 8000 bytes of input - 8000 ANSI characters or 4000 Unicode. Finally, it doesn't appear unbreakable either: Wikipedia Link.

I mentioned HashBytes way back - now I'm just pointing out some of the reasons that I have yet to use it in a real production system.

Paul

Hmm, I though that I counted 19 bytes... Anyway, qualifying myself again :-), "nigh-unbreakable" was referring to its use as a signature in detecting random data changes, not as a security device.

-- RBarryYoung, (302)375-0451 blog: MovingSQL.com, Twitter: @RBarryYoung
Proactive Performance Solutions, Inc.
"Performance is our middle name."
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search