SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Comparing a hash


Comparing a hash

Author
Message
William Vach
William Vach
Hall of Fame
Hall of Fame (3.3K reputation)Hall of Fame (3.3K reputation)Hall of Fame (3.3K reputation)Hall of Fame (3.3K reputation)Hall of Fame (3.3K reputation)Hall of Fame (3.3K reputation)Hall of Fame (3.3K reputation)Hall of Fame (3.3K reputation)

Group: General Forum Members
Points: 3256 Visits: 1743
Good question! I learned something today.
SQLRNNR
SQLRNNR
SSC-Dedicated
SSC-Dedicated (33K reputation)SSC-Dedicated (33K reputation)SSC-Dedicated (33K reputation)SSC-Dedicated (33K reputation)SSC-Dedicated (33K reputation)SSC-Dedicated (33K reputation)SSC-Dedicated (33K reputation)SSC-Dedicated (33K reputation)

Group: General Forum Members
Points: 33098 Visits: 18560
Thanks. I, too, learned something new today.



Jason AKA CirqueDeSQLeil
I have given a name to my pain...
MCM SQL Server, MVP


SQL RNNR

Posting Performance Based Questions - Gail Shaw

Christian Buettner-167247
Christian Buettner-167247
Hall of Fame
Hall of Fame (3.6K reputation)Hall of Fame (3.6K reputation)Hall of Fame (3.6K reputation)Hall of Fame (3.6K reputation)Hall of Fame (3.6K reputation)Hall of Fame (3.6K reputation)Hall of Fame (3.6K reputation)Hall of Fame (3.6K reputation)

Group: General Forum Members
Points: 3627 Visits: 3889
Now this one still bothers me...

Why does it execute the no part after the 8125 error?
By the way, it only executes it with the following SET options:
SET XACT_ABORT OFF
SET ANSI_WARNINGS ON



Funny enough, one would think that if the warnings are on, the error would stop execution, but instead you need to turn warnings off to make SQL Server stop the execution. And unfortunately BOL does not really assist in understanding this.

At least the XACT_ABORT setting makes some kind of sense with regards to the error.

Best Regards,

Chris Büttner
Saurabh Dwivedy
Saurabh Dwivedy
Mr or Mrs. 500
Mr or Mrs. 500 (538 reputation)Mr or Mrs. 500 (538 reputation)Mr or Mrs. 500 (538 reputation)Mr or Mrs. 500 (538 reputation)Mr or Mrs. 500 (538 reputation)Mr or Mrs. 500 (538 reputation)Mr or Mrs. 500 (538 reputation)Mr or Mrs. 500 (538 reputation)

Group: General Forum Members
Points: 538 Visits: 340


SELECT HASHBYTES('SHA1', REPLICATE(CAST('A' AS varchar(max)), 8000)) returns a 20-byte value, and SELECT HASHBYTES('SHA1', REPLICATE(CAST('A' AS varchar(max)), 9000)) returns the same warning message and no value.

Why the IF comparison between the two doesn't abort but chooses to execute the ELSE clause is, in all honesty, beyond me. I firsth thought it was because the too-long input causes the function to return NULL (in addition to the message), but when I use IF HASHBYTES(...) IS NULL, I still got the "no" result.

Anyhoo, I stand corrected. The HASHBYTES does apparently indeed have an (undocumented!!) limit of 8000 bytes on it's input string. Thanks for pointing that out!


Despite the fact that I knew that Hashbytes has an 8000 byte limit on its input, I was stumped at the very point you have highlighted above - I thought it would error out. But it doesn't.

The interesting thing I found when I did some testing on my sql server 2008 developer edition is that:

When I run the code -

select HASHBYTES('sha1', REPLICATE(cast('a' as varchar(max)), 9000))

I get the error msg as indicated by you.

BUT when I run

select HASHBYTES('sha1', REPLICATE('a', 9000))

I do not get an error but a certain Hashbyte output. I realized further that actually, Replicate('a', 9000) was actually Replicating the 'A' value upto 8000 bytes and ignoring the rest. It wasn't erroring out. So in effect, the input condition to HASHBYTES in the question posted wasn't getting violated as presented in the QoTD.
"
However, casting the 'A' to Varchar(Max) and then replicating it 9000 times produces 9000 "A's" and THIS causes the input to HASHBYTES to exceed the permissible limit.

Running this code produces an ERROR Message BUT and the Output NO.

IF HASHBYTES('SHA1', REPLICATE(cast('A' as varchar(max)), 9000)) = HASHBYTES('SHA1', REPLICATE('A', 8000))
PRINT 'yes.'
ELSE PRINT 'no.'

Saurabh Dwivedy
___________________________________________________________

My Blog: http://tinyurl.com/dwivedys

For better, quicker answers, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537

Be Happy!
john.arnott
john.arnott
SSCrazy
SSCrazy (2K reputation)SSCrazy (2K reputation)SSCrazy (2K reputation)SSCrazy (2K reputation)SSCrazy (2K reputation)SSCrazy (2K reputation)SSCrazy (2K reputation)SSCrazy (2K reputation)

Group: General Forum Members
Points: 2024 Visits: 3059
Hugo Kornelis (2/1/2010)

SELECT HASHBYTES('SHA1', REPLICATE(CAST('A' AS varchar(max)), 8000)) returns a 20-byte value, and SELECT HASHBYTES('SHA1', REPLICATE(CAST('A' AS varchar(max)), 9000)) returns the same warning message and no value. Why the IF comparison between the two doesn't abort but chooses to execute the ELSE clause is, in all honesty, beyond me. I firsth thought it was because the too-long input causes the function to return NULL (in addition to the message), but when I use IF HASHBYTES(...) IS NULL, I still got the "no" result.


This piqued my curiosity. Could the "no" result from your direct test

IF HASHBYTES('SHA1', REPLICATE(CAST('A' AS varchar(max)), 9000))
is NULL
PRINT 'yes.';
ELSE
PRINT 'no.';


...be because the result of the IF test is NULL? When the HASHBYTES() function fails with the input truncation error, it does not return NULL, but rather doesn't return anything. Then, the IF statement executes and doesn't find a NULL, so returns FALSE, which of course leads to the "Print 'No'" branch.

Running in SSMS with results to text makes it a bit easier to see because it says "NULL" for null results and nothing for the failed HASHBYTES().
select  IsNull(HASHBYTES('SHA1', REPLICATE(CAST('A' AS varchar(max)), 9000)),0x2222aaaa9999) 
select HASHBYTES('SHA1', REPLICATE(CAST('A' AS varchar(max)), 4000))
Declare @MyNull varbinary
select @myNull


Returns

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Msg 8152, Level 16, State 10, Line 1
String or binary data would be truncated.


------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
0xD5FE2200837AB7B4FB073787F8E531D6C6A8CDD2

(1 row(s) affected)


----
NULL

(1 row(s) affected)


vk-kirov
vk-kirov
Hall of Fame
Hall of Fame (3.7K reputation)Hall of Fame (3.7K reputation)Hall of Fame (3.7K reputation)Hall of Fame (3.7K reputation)Hall of Fame (3.7K reputation)Hall of Fame (3.7K reputation)Hall of Fame (3.7K reputation)Hall of Fame (3.7K reputation)

Group: General Forum Members
Points: 3736 Visits: 4408
john.arnott (2/2/2010)
When the HASHBYTES() function fails with the input truncation error, it does not return NULL, but rather doesn't return anything.

But usually SQL Server considers "nothing" as a NULL value.

Here is the example:
declare @tab table (a int)

select a from @tab

if (select a from @tab) is null
print 'null'
else
print 'not null'



The result is:
a
-----------

(0 row(s) affected)

null


As we can see, "nothing" = "null" in this case.

I think that the warning (or error, or exception, or whatever) plays its important role.

Here is the modified example:
declare @tab table (a int)

select a from @tab where 1 = 1/0

if (select a from @tab where 1 = 1/0) is null
print 'null'
else
print 'not null'



The result is:
a
-----------
Msg 8134, Level 16, State 1, Line 3
Divide by zero error encountered.

Msg 8134, Level 16, State 1, Line 5
Divide by zero error encountered.
not null

Does it look like "nothing" <> "null"? I don't think so, because of the error message. I think that SQL Server executes "select a from @tab where 1 = 1/0", gets an error, and does not even try to compare this with NULL. Since there's no TRY..CATCH block, batch execution continues and... steps into the ELSE block. But why is the ELSE block not skipped? Maybe it's "by design" :-)
john.arnott
john.arnott
SSCrazy
SSCrazy (2K reputation)SSCrazy (2K reputation)SSCrazy (2K reputation)SSCrazy (2K reputation)SSCrazy (2K reputation)SSCrazy (2K reputation)SSCrazy (2K reputation)SSCrazy (2K reputation)

Group: General Forum Members
Points: 2024 Visits: 3059
The "inconsistencies" I point to in this post may actually be controlled by the use of settings as pointed out by Paul White in a post later in this thread. Here's that post:


Paul White NZ (3/30/2010)
Everything that has been observed, regarding overflows, truncations, ELSE clauses executing and so on...is all documented in Books Online, under the following topics:

SET ANSI_WARNINGS
SET ARITHABORT
SET ARITHIGNORE

The various behaviours described in this thread can all be varied, and explained, by reading those references.

The other thing is that a scalar sub-query like in the example (SELECT a FROM @tab) always returns a value - since it returns a column reference. If no rows are produced, the scalar sub-query presents a NULL.





vk-kirov,

Your example is consistent with the HASHBYTES example from Hugo. And it DOES look as though SQL is not treating "no result" as the same thing as "value undetermined" or NULL. But then, I suppose it makes a certain amount of logical sense that a test for NULL return FALSE when the thing being tested is not defined; and "not defined" is not the same as "has an undetermined value".

TSQL is inconsistent. A simple conversion error stops the processing.

Declare @x datetime

set @x = convert(datetime,'2010-0518')
If @x is NULL
print 'Null found'
else print '@x is not NULL'
--------------- Returns:
--Msg 241, Level 16, State 1, Line 3
--Conversion failed when converting datetime from character string.


But the inconsistency is that the HASHBYTES overflow allows the IF to continue, which finds that the local variable has an undefined value, expressed as NULL:
Declare @b varbinary
Set @b = HASHBYTES('SHA1', REPLICATE(CAST('A' AS varchar(max)), 9000))
If @b is NULL
print 'Null found'
else print '@b is not NULL'
--------------- Returns:
--Msg 8152, Level 16, State 10, Line 2
--String or binary data would be truncated.
--Null found


Finallly, testing the HASHBYTES function directly gets back to the original form. Again the error doesn't stop further processing, but this time, the execution result itself is undefined, so the test fails and the ELSE branch is followed. At this point, it doesn't matter whether you test for IS NULL or for IS NOT NULL.


If HASHBYTES('SHA1', REPLICATE(CAST('A' AS varchar(max)), 9000)) is NULL
print 'Null found'
else print 'Result is not NULL'

If HASHBYTES('SHA1', REPLICATE(CAST('A' AS varchar(max)), 9000)) is NOT NULL
print 'Null found'
else print 'Result is not NULL'

-- --------------------- Returns:
--Msg 8152, Level 16, State 10, Line 2
--String or binary data would be truncated.
--Result is not NULL
--Msg 8152, Level 16, State 10, Line 6
--String or binary data would be truncated.
--Result is not NULL


------
edited to add preface with post from Paul.
------
David Data
David Data
Old Hand
Old Hand (345 reputation)Old Hand (345 reputation)Old Hand (345 reputation)Old Hand (345 reputation)Old Hand (345 reputation)Old Hand (345 reputation)Old Hand (345 reputation)Old Hand (345 reputation)

Group: General Forum Members
Points: 345 Visits: 810
"Inconsistent" - indeed. I thought this should generate an error, so I don't get the 1 point.

But this is a bug in SQL Server! Given one often uses hashing to verify things are identical when confirming data integrity, a hash which treats two different strings as identical is a a serious failure. The string size limit for HASHBYTES may well be 8,000 (4,000 for nvarchar), but larger strings HASHBYTES should NOT result in the hash of a truncated string - it should produce an ERROR. Or allow long strings when the data type permits them.

Has anyone raised this as a bug yet?
Hugo Kornelis
Hugo Kornelis
SSChampion
SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)

Group: General Forum Members
Points: 11198 Visits: 12004
a hash which treats two different strings as identical is a a serious failure.
No, it's not. Collisions in a hash are by design. After all, the possible number of different values in a 8000-character string is much larger than the possible number of 160-but hash values.

That's not what causes the problem here, though. As I'm sure you understand after reading all the other replies. The real problem is that REPLICATE uses the same input as output data type, which in this case has an 8000-byte limit, and that longer values are truncated. Raising a bug is pointless, as this is all both by design and well documented.


Hugo Kornelis, SQL Server MVP
Visit my SQL Server blog: http://sqlblog.com/blogs/hugo_kornelis
john.arnott
john.arnott
SSCrazy
SSCrazy (2K reputation)SSCrazy (2K reputation)SSCrazy (2K reputation)SSCrazy (2K reputation)SSCrazy (2K reputation)SSCrazy (2K reputation)SSCrazy (2K reputation)SSCrazy (2K reputation)

Group: General Forum Members
Points: 2024 Visits: 3059
The real problem is that REPLICATE uses the same input as output data type, which in this case has an 8000-byte limit, and that longer values are truncated.

I would have thought that although this reasonable (if not obvious) behavior of REPLICATE led to the two HASHBYTES calls evaluating identical strings in the original QOD, the real problem is that when HASHBYTES fails due to input-overflow (when we feed it a varchar(max), it apparently does not present a result of NULL that may be tested. Rather, as I showed in my previous post, an IF statement testing the output of HASHBYTES goes to the FALSE branch regardless of whether you test for "IS NULL" or for "IS NOT NULL". That would seem more of a candidate for bug status.
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search