
SHA256 doesn't yield same result between C# and SQL Server when using Spanish words


I am using SHA256 to produce a hashed hexadecimal string. It works fine with regular characters, but when the string to be hashed includes accents/diacritics, I get different results in C# and T-SQL. I would prefer to make the change on the SQL Server end.

  • Example word that does match: bird
  • Example word that does not match: MUÑOZ

C#

// Requires: using System.Security.Cryptography; using System.Text;
using (SHA256 sha2 = SHA256.Create())
{
    // The input string is encoded as UTF-8 before hashing.
    var hash = sha2.ComputeHash(Encoding.UTF8.GetBytes(fullAddress));
    var hexString = new StringBuilder(hash.Length * 2);

    for (int i = 0; i < hash.Length; i++)
    {
        // "X2" pads each byte to two uppercase hex digits; "X" would drop leading zeros.
        hexString.Append(hash[i].ToString("X2"));
    }

    return hexString.ToString();
}

SQL

declare @fullAddress nvarchar(500)
set @fullAddress = N'MUÑOZ' -- N prefix keeps the literal as Unicode (nvarchar)
select CONVERT([varchar](256), HASHBYTES('SHA2_256', @fullAddress), 2)

Solution

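  • .NET strings, Windows and SQL Server's nvarchar type use UTF-16, not UTF-8. The C# snippet encodes the string as UTF-8 before hashing, while HASHBYTES hashes the UTF-16 bytes of the nvarchar value, so the two snippets are hashing different bytes. When the same encoding is used on both sides, the hash strings are identical.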

    This:

    using (var sha2 = System.Security.Cryptography.SHA256.Create())
    {
        // Encoding.Unicode is UTF-16 little-endian, the byte layout nvarchar uses.
        var hash = sha2.ComputeHash(Encoding.Unicode.GetBytes("MUÑOZ"));
        string hexString = string.Empty;

        for (int i = 0; i < hash.Length; i++)
        {
            hexString += hash[i].ToString("X2");
        }
        Console.WriteLine(hexString);
    }
    

    Produces:

    276DB000BF524070F106A2C413942159AB5EF2F5CA5A5B91AB2F3B6FA48EE1ED
    

    Which is the same as the hash string SQL Server returns.
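
To make the difference visible, here is a minimal sketch that prints the raw bytes each encoding produces for the example word and then hashes both. The names `EncodingHashDemo`, `Hex` and `Sha256Hex` are hypothetical helpers introduced only for illustration; they are not part of the original code.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class EncodingHashDemo
{
    // Hypothetical helper: formats bytes as uppercase hex with no separators.
    public static string Hex(byte[] bytes)
    {
        return BitConverter.ToString(bytes).Replace("-", "");
    }

    // Hypothetical helper: SHA-256 of the text under the given encoding, as hex.
    public static string Sha256Hex(string text, Encoding encoding)
    {
        using (var sha2 = SHA256.Create())
        {
            return Hex(sha2.ComputeHash(encoding.GetBytes(text)));
        }
    }

    public static void Main()
    {
        // "\u00D1" is Ñ; the escape avoids depending on the source file's encoding.
        const string word = "MU\u00D1OZ";

        // UTF-8: Ñ becomes the two bytes C3 91, the other letters one byte each.
        Console.WriteLine(Hex(Encoding.UTF8.GetBytes(word)));    // 4D55C3914F5A

        // UTF-16 little-endian (Encoding.Unicode): two bytes per character here.
        Console.WriteLine(Hex(Encoding.Unicode.GetBytes(word))); // 4D005500D1004F005A00

        // Different input bytes, therefore different SHA-256 digests.
        Console.WriteLine(Sha256Hex(word, Encoding.UTF8));
        Console.WriteLine(Sha256Hex(word, Encoding.Unicode));
    }
}
```

The last line printed is the value to compare against `HASHBYTES('SHA2_256', @fullAddress)` for an nvarchar variable. Since UTF-16 uses two bytes for each of these characters, the UTF-8 and UTF-16 byte sequences differ in length as well as content, which is why the encoding has to be chosen deliberately on both sides.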