SHA256 using HASHBYTES function

  • I'm trying to perform SHA256 hashing, using the standard HASHBYTES function (2012). The goal is to join 2 tables: one hashed internally in SQLServer, the other imported (already hashed in a different platform). Sort of this:

    TABLE 1  (SQLserver) -> Hash it SHA256      <= INNER JOIN =>     TABLE 2 (imported into same SQLServer from .CSV file)

    I've done some testing before the matching, but the output of the HASHBYTES function is unexpected. E.g.

    DECLARE @HashThis nvarchar(4000);
    SELECT @HashThis = CONVERT(nvarchar(4000), '123');
    SELECT HASHBYTES('SHA2_256', @HashThis) ;
    GO

    produces as output:
    0x26D6A8AD97C75FFC548F6873E5E93CE475479E3E1A1097381E54221FB53EC1D2

    However, if I run SHA256 in R, or in any of the free SHA256 websites, the correct output for '123' is:

    a665a45920422f9d417e4867efdc4fb8a04a1f3fff1fa07e998e86f7f7a27ae3

    Initially I thought they don't match because my output is a binary var, but no: even if you convert to nvarchar, you never get the same output. My questions:

    -   What is going on here?  Is this because of the coding language in the server (UTF, UNICODE, other. sorry not familiar with language codes)?
    -   How can I make sure that, when using HASHBYTES-SHA256, the output matches the 'standard' output shown above?  
    -    Even if I fix the previous issue, the 0x at the start baffles me a bit. Is is a blank char? Could this cause trouble when trying to match the 2 tables?

    Thanks in advance,

  • a_ud - Thursday, July 6, 2017 4:35 AM

    I'm trying to perform SHA256 hashing, using the standard HASHBYTES function (2012). The goal is to join 2 tables: one hashed internally in SQLServer, the other imported (already hashed in a different platform). Sort of this:

    TABLE 1  (SQLserver) -> Hash it SHA256      <= INNER JOIN =>     TABLE 2 (imported into same SQLServer from .CSV file)

    I've done some testing before the matching, but the output of the HASHBYTES function is unexpected. E.g.

    DECLARE @HashThis nvarchar(4000);
    SELECT @HashThis = CONVERT(nvarchar(4000), '123');
    SELECT HASHBYTES('SHA2_256', @HashThis) ;
    GO

    produces as output:
    0x26D6A8AD97C75FFC548F6873E5E93CE475479E3E1A1097381E54221FB53EC1D2

    However, if I run SHA256 in R, or in any of the free SHA256 websites, the correct output for '123' is:

    a665a45920422f9d417e4867efdc4fb8a04a1f3fff1fa07e998e86f7f7a27ae3

    Initially I thought they don't match because my output is a binary var, but no: even if you convert to nvarchar, you never get the same output. My questions:

    -   What is going on here?  Is this because of the coding language in the server (UTF, UNICODE, other. sorry not familiar with language codes)?
    -   How can I make sure that, when using HASHBYTES-SHA256, the output matches the 'standard' output shown above?  
    -    Even if I fix the previous issue, the 0x at the start baffles me a bit. Is is a blank char? Could this cause trouble when trying to match the 2 tables?

    Thanks in advance,

    The SHA2 hash you posted as correct is the ascii string hash, not the unicode hash
    😎

    DECLARE @HashThis varchar(4000);
    SELECT @HashThis = '123';
    SELECT HASHBYTES('SHA2_256', @HashThis) ;

    Produces
    0xA665A45920422F9D417E4867EFDC4FB8A04A1F3FFF1FA07E998E86F7F7A27AE3

    which is the same as  in R, or in any SHA256 websites

  • Thanks for the prompt reply. As said not very familiar with code chars, or how to switch from one to another.

    • Does SQL Server then use UNICODE by default   (while R & websites use ASCII )?  Is this set by the server OS, or the client OS?

    • Do other popular databases, e.g. mySQL, also use UNICODE?  

    My interest is to perform a JOIN, therefore if I can't switch the charset of both ends (particularly the SQLServer one), it will not work.

  • a_ud - Thursday, July 6, 2017 4:58 AM

    Thanks for the prompt reply. As said not very familiar with code chars, or how to switch from to another.

    • Does SQL Server then use UNICODE by default   (while R & websites use ASCII )?  Is this set by the server OS, or the client OS?

    • Do other popular databases, e.g. mySQL, also use UNICODE?  

    My interest is to perform a JOIN, therefore if I can't switch the charset of both ends (particularly the SQLServer one), it will not work.

    1) SQL Server uses which ever type you decide to use, i.e. NVARCHAR for unicode and VARCHAR for ascii strings.
    2) Most RDBMS support both unicode and ascii
    😎

    If you cannot change the type in SQL Server then you can use convert inside the hash function
    DECLARE @HashThis NVARCHAR(4000);
    SELECT @HashThis = N'123';
    SELECT HASHBYTES('SHA2_256', CONVERT(VARCHAR(4000),@HashThis,0)) ;

  • a_ud - Thursday, July 6, 2017 4:58 AM

    Does SQL Server then use UNICODE by default   (while R & websites use ASCII )?  Is this set by the server OS, or the client OS?

    No, no and no.
    It was using unicode because you declared the variable to be hashed as a unicode string.

    DECLARE @HashThis nvarchar(4000);

    And the 0x at the start indicates that it's a binary string (hexadecimal characters)

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • That worked fine, and I learnt something new !!

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply