HashBytes sometimes adds trailing zeroes

  • Hi,

    declare @Paper varchar(100), @Value_date datetime2;
    select @Paper = 'Kotimaiset osakaslainat, kiint.', @Value_date = '2017-11-30';

    select CAST(HASHBYTES('MD5', CAST(@Value_date AS char(10)) + ' # ' + LTrim(Rtrim(@Paper))) AS VARBINARY(40)) dv_key ;

    0xB54E921E9BF81A19E76D8D15D16AD000

    But when inserted to varbinary(40) column in is modified to 0xB54E921E9BF81A19E76D8D15D16AD0

    Two trailing zeroes are removed

    When I compare the incoming 0xB54E921E9BF81A19E76D8D15D16AD000 to 0xB54E921E9BF81A19E76D8D15D16AD0 in Informatica lookup, it considers them different and the row is inserted to table and the causing an error due duplicate value in a primary key column...

    Why does the HAshBytes function return those two trailing zeroes?
    Can I get rid of them?

    Ville

  • In my case the result is consistent, I used this script:

    declare @Paper varchar(100), @Value_date datetime2;

    declare @result VARBINARY(40);

    select @Paper = 'Kotimaiset osakaslainat, kiint.', @Value_date = '2017-11-30';

    declare @tst table(result VARBINARY(40));

    create table dbo.tst(result VARBINARY(40));

    select CAST(HASHBYTES('MD5', CAST(@Value_date AS char(10)) + ' # ' + LTrim(Rtrim(@Paper))) AS VARBINARY(40)) dv_key;

    set @result = HASHBYTES('MD5', CAST(@Value_date AS char(10)) + ' # ' + LTrim(Rtrim(@Paper)));

    select @result;

    insert into @tst values(HASHBYTES('MD5', CAST(@Value_date AS char(10)) + ' # ' + LTrim(Rtrim(@Paper))));

    select * from @tst;

    insert into dbo.tst values(HASHBYTES('MD5', CAST(@Value_date AS char(10)) + ' # ' + LTrim(Rtrim(@Paper))));

    select * from dbo.tst;

    drop table dbo.tst;

  • Evgeny - Thursday, December 14, 2017 4:20 PM

    In my case the result is consistent, I used this script:

    declare @Paper varchar(100), @Value_date datetime2;

    declare @result VARBINARY(40);

    select @Paper = 'Kotimaiset osakaslainat, kiint.', @Value_date = '2017-11-30';

    declare @tst table(result VARBINARY(40));

    create table dbo.tst(result VARBINARY(40));

    select CAST(HASHBYTES('MD5', CAST(@Value_date AS char(10)) + ' # ' + LTrim(Rtrim(@Paper))) AS VARBINARY(40)) dv_key;

    set @result = HASHBYTES('MD5', CAST(@Value_date AS char(10)) + ' # ' + LTrim(Rtrim(@Paper)));

    select @result;

    insert into @tst values(HASHBYTES('MD5', CAST(@Value_date AS char(10)) + ' # ' + LTrim(Rtrim(@Paper))));

    select * from @tst;

    insert into dbo.tst values(HASHBYTES('MD5', CAST(@Value_date AS char(10)) + ' # ' + LTrim(Rtrim(@Paper))));

    select * from dbo.tst;

    drop table dbo.tst;

    Hi,
    yes so it seems so.

    The difference is that in my case the insert goes  through Informatica ETL tool. There is something with the Unix ODBC API or in the tool itself that causes this. I know I must try to investigate that.

    But

  • Hi,

    yes so it seems...

    The difference is that in my case the insert goes through Informatica ETL tool. There is something with the Unix ODBC API or in the tool itself that causes this. I know I must try to investigate that.

    But since 0xB54E921E9BF81A19E76D8D15D16AD000 and 0xB54E921E9BF81A19E76D8D15D16AD0 are the same value atleast when comparing in Where clause

    select 'Hello' where 0xB54E921E9BF81A19E76D8D15D16AD000 = 0xB54E921E9BF81A19E76D8D15D16AD0 ==> 'Hello'
    but
    select 'Hello' where 0xB54E921E9BF81A19E76D8D15D16AD0000 = 0xB54E921E9BF81A19E76D8D15D16AD0 ==> Nothing
    and then again
    select 'Hello' where 0xB54E921E9BF81A19E76D8D15D16AD00000 = 0xB54E921E9BF81A19E76D8D15D16AD0 ==> 'Hello'

    So a even number of trailing zeroes does not change the value.

    Suppose I don't find any explanation/reason from Informatica, is there anything I can do with T-SQL?

    Ville

  • I suspect you're getting an implicit conversion issue.   Somewhere in your processing, that data type of VARBINARY(40) is getting implicitly converted, and that succeeds because the resulting data type is likely some kind of character string, where an additional 0x00 worth of binary data is irrelevant, because most string data types using a binary 0x00 to terminate the string, down at the lowest level.   Thus the comparison only processes up to the string terminator.   Figure out why you need varbinary() or figure out where the implicit conversion occurs.   The only question that remains is understanding how varbinary values are terminated.  I don't know that answer...

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • Also, the MD5 hash is a 16 byte value.  Don't count the leading 0x.  You can change the VARBINARY(40) to VARBINARY(16) and hashbytes returns the same value.  So, there may be a truncation occurring going into your Informatica process.  Sorry, I haven't touched Informatica in over 12 years.

  • Hi,

    OK. Binary(16) would solve the problem. There is only one minor problem...there is a big DataVault solely based on varbinary(40) fields - too big a job to start changing data types and Informatica loads...

    Truncation in the load is not the case, all fports are varbinaru(40)

    Thnaks, Ville

  • WilburSmith - Thursday, December 28, 2017 11:14 AM

    Hi,

    OK. Binary(16) would solve the problem. There is only one minor problem...there is a big DataVault solely based on varbinary(40) fields - too big a job to start changing data types and Informatica loads...

    Truncation in the load is not the case, all fports are varbinaru(40)

    Thnaks, Ville

    This value: 0xB54E921E9BF81A19E76D8D15D16AD000, is 16 bytes.  HASHBYTES did NOT add trailing bytes to the value.  If the final two zeros are being dropped, it is not HASHBYTES causing the problem.  I get the same value whether I run HASHBYTES alone or cast it VARBINARY(40).

  • Hi,

    yes I agree, it is not a hashbytes problem. instead Informatica removes the two trailing zeroes when inserting the row to db. Why..? Remains a mystery.

    ville

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply