SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


HashBytes sometimes adds trailing zeroes


HashBytes sometimes adds trailing zeroes

Author
Message
WilburSmith
WilburSmith
Ten Centuries
Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)

Group: General Forum Members
Points: 1327 Visits: 247
Hi,

declare @Paper varchar(100), @Value_date datetime2;
select @Paper = 'Kotimaiset osakaslainat, kiint.', @Value_date = '2017-11-30';

select CAST(HASHBYTES('MD5', CAST(@Value_date AS char(10)) + ' # ' + LTrim(Rtrim(@Paper))) AS VARBINARY(40)) dv_key ;

0xB54E921E9BF81A19E76D8D15D16AD000

But when inserted to varbinary(40) column in is modified to 0xB54E921E9BF81A19E76D8D15D16AD0

Two trailing zeroes are removed

When I compare the incoming 0xB54E921E9BF81A19E76D8D15D16AD000 to 0xB54E921E9BF81A19E76D8D15D16AD0 in Informatica lookup, it considers them different and the row is inserted to table and the causing an error due duplicate value in a primary key column...

Why does the HAshBytes function return those two trailing zeroes?
Can I get rid of them?

Ville
Evgeny Garaev
Evgeny Garaev
SSCrazy
SSCrazy (2.1K reputation)SSCrazy (2.1K reputation)SSCrazy (2.1K reputation)SSCrazy (2.1K reputation)SSCrazy (2.1K reputation)SSCrazy (2.1K reputation)SSCrazy (2.1K reputation)SSCrazy (2.1K reputation)

Group: General Forum Members
Points: 2144 Visits: 1109
In my case the result is consistent, I used this script:

declare @Paper varchar(100), @Value_date datetime2;

declare @result VARBINARY(40);

select @Paper = 'Kotimaiset osakaslainat, kiint.', @Value_date = '2017-11-30';

declare @tst table(result VARBINARY(40));

create table dbo.tst(result VARBINARY(40));

select CAST(HASHBYTES('MD5', CAST(@Value_date AS char(10)) + ' # ' + LTrim(Rtrim(@Paper))) AS VARBINARY(40)) dv_key;

set @result = HASHBYTES('MD5', CAST(@Value_date AS char(10)) + ' # ' + LTrim(Rtrim(@Paper)));

select @result;

insert into @tst values(HASHBYTES('MD5', CAST(@Value_date AS char(10)) + ' # ' + LTrim(Rtrim(@Paper))));

select * from @tst;

insert into dbo.tst values(HASHBYTES('MD5', CAST(@Value_date AS char(10)) + ' # ' + LTrim(Rtrim(@Paper))));

select * from dbo.tst;

drop table dbo.tst;


WilburSmith
WilburSmith
Ten Centuries
Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)

Group: General Forum Members
Points: 1327 Visits: 247
Evgeny - Thursday, December 14, 2017 4:20 PM
In my case the result is consistent, I used this script:

declare @Paper varchar(100), @Value_date datetime2;

declare @result VARBINARY(40);

select @Paper = 'Kotimaiset osakaslainat, kiint.', @Value_date = '2017-11-30';

declare @tst table(result VARBINARY(40));

create table dbo.tst(result VARBINARY(40));

select CAST(HASHBYTES('MD5', CAST(@Value_date AS char(10)) + ' # ' + LTrim(Rtrim(@Paper))) AS VARBINARY(40)) dv_key;

set @result = HASHBYTES('MD5', CAST(@Value_date AS char(10)) + ' # ' + LTrim(Rtrim(@Paper)));

select @result;

insert into @tst values(HASHBYTES('MD5', CAST(@Value_date AS char(10)) + ' # ' + LTrim(Rtrim(@Paper))));

select * from @tst;

insert into dbo.tst values(HASHBYTES('MD5', CAST(@Value_date AS char(10)) + ' # ' + LTrim(Rtrim(@Paper))));

select * from dbo.tst;

drop table dbo.tst;


Hi,
yes so it seems so.

The difference is that in my case the insert goes through Informatica ETL tool. There is something with the Unix ODBC API or in the tool itself that causes this. I know I must try to investigate that.

But

WilburSmith
WilburSmith
Ten Centuries
Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)

Group: General Forum Members
Points: 1327 Visits: 247
Hi,

yes so it seems...

The difference is that in my case the insert goes through Informatica ETL tool. There is something with the Unix ODBC API or in the tool itself that causes this. I know I must try to investigate that.

But since 0xB54E921E9BF81A19E76D8D15D16AD000 and 0xB54E921E9BF81A19E76D8D15D16AD0 are the same value atleast when comparing in Where clause

select 'Hello' where 0xB54E921E9BF81A19E76D8D15D16AD000 = 0xB54E921E9BF81A19E76D8D15D16AD0 ==> 'Hello'
but
select 'Hello' where 0xB54E921E9BF81A19E76D8D15D16AD0000 = 0xB54E921E9BF81A19E76D8D15D16AD0 ==> Nothing
and then again
select 'Hello' where 0xB54E921E9BF81A19E76D8D15D16AD00000 = 0xB54E921E9BF81A19E76D8D15D16AD0 ==> 'Hello'

So a even number of trailing zeroes does not change the value.

Suppose I don't find any explanation/reason from Informatica, is there anything I can do with T-SQL?

Ville
sgmunson
sgmunson
SSC-Forever
SSC-Forever (43K reputation)SSC-Forever (43K reputation)SSC-Forever (43K reputation)SSC-Forever (43K reputation)SSC-Forever (43K reputation)SSC-Forever (43K reputation)SSC-Forever (43K reputation)SSC-Forever (43K reputation)

Group: General Forum Members
Points: 43618 Visits: 5422
I suspect you're getting an implicit conversion issue. Somewhere in your processing, that data type of VARBINARY(40) is getting implicitly converted, and that succeeds because the resulting data type is likely some kind of character string, where an additional 0x00 worth of binary data is irrelevant, because most string data types using a binary 0x00 to terminate the string, down at the lowest level. Thus the comparison only processes up to the string terminator. Figure out why you need varbinary() or figure out where the implicit conversion occurs. The only question that remains is understanding how varbinary values are terminated. I don't know that answer...

Steve
‌(aka sgmunson)
Smile Smile Smile
Health & Nutrition
Lynn Pettis
Lynn Pettis
SSC Guru
SSC Guru (225K reputation)SSC Guru (225K reputation)SSC Guru (225K reputation)SSC Guru (225K reputation)SSC Guru (225K reputation)SSC Guru (225K reputation)SSC Guru (225K reputation)SSC Guru (225K reputation)

Group: General Forum Members
Points: 225810 Visits: 40423
Also, the MD5 hash is a 16 byte value. Don't count the leading 0x. You can change the VARBINARY(40) to VARBINARY(16) and hashbytes returns the same value. So, there may be a truncation occurring going into your Informatica process. Sorry, I haven't touched Informatica in over 12 years.

Cool
Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
WilburSmith
WilburSmith
Ten Centuries
Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)

Group: General Forum Members
Points: 1327 Visits: 247
Hi,

OK. Binary(16) would solve the problem. There is only one minor problem...there is a big DataVault solely based on varbinary(40) fields - too big a job to start changing data types and Informatica loads...

Truncation in the load is not the case, all fports are varbinaru(40)

Thnaks, Ville
Lynn Pettis
Lynn Pettis
SSC Guru
SSC Guru (225K reputation)SSC Guru (225K reputation)SSC Guru (225K reputation)SSC Guru (225K reputation)SSC Guru (225K reputation)SSC Guru (225K reputation)SSC Guru (225K reputation)SSC Guru (225K reputation)

Group: General Forum Members
Points: 225810 Visits: 40423
WilburSmith - Thursday, December 28, 2017 11:14 AM
Hi,

OK. Binary(16) would solve the problem. There is only one minor problem...there is a big DataVault solely based on varbinary(40) fields - too big a job to start changing data types and Informatica loads...

Truncation in the load is not the case, all fports are varbinaru(40)

Thnaks, Ville

This value: 0xB54E921E9BF81A19E76D8D15D16AD000, is 16 bytes. HASHBYTES did NOT add trailing bytes to the value. If the final two zeros are being dropped, it is not HASHBYTES causing the problem. I get the same value whether I run HASHBYTES alone or cast it VARBINARY(40).


Cool
Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
WilburSmith
WilburSmith
Ten Centuries
Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)

Group: General Forum Members
Points: 1327 Visits: 247
Hi,

yes I agree, it is not a hashbytes problem. instead Informatica removes the two trailing zeroes when inserting the row to db. Why..? Remains a mystery.

ville
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum








































































































































































SQLServerCentral


Search