When I am working with none-English characters the generated hash is not the same but otherwise, everything is fine. does anyone have a solution to get the same result?
MYSQL
SELECT MD5( 'سلام')
------------- result: 78903c575b0dda53c4a7644a2dd36d0e
SQL-Server
SELECT CONVERT(VARCHAR(50), HASHBYTES('MD5', 'سلام'), 2)
-------------- result:3C50F6458899B3C0988BE358290F5F24
SELECT CONVERT(nVARCHAR(50), HASHBYTES('MD5', N'سلام'), 2)
-------------- result:0381CA5081FBC68B2F55F2F2C21399D7
Based on MySQL returning D8B3D984D8A7D985
from SELECT HEX(CAST('سلام' as binary))
it appears that MySQL is using UTF-8 character encoding.
On SQL Server 2019 you can use the LATIN1_GENERAL_100_CI_AS_SC_UTF8 collation (it's not supported on SQL Server 2017) such as the following:
create table #Test (
UTF16 nvarchar(max),
UTF8 varchar(max) COLLATE LATIN1_GENERAL_100_CI_AS_SC_UTF8
)
insert #Test values (N'سلام', N'سلام');
select UTF16 from #Test;
select CAST(UTF16 as varbinary) as [UTF16-Bytes] from #Test;
select UTF8 from #Test;
select CAST(UTF8 as varbinary) as [UTF8-Bytes] from #Test;
Which returns:
UTF16
سلام
UTF16-Bytes
0x3306440627064506
UTF8
سلام
UTF8-Bytes
0xD8B3D984D8A7D985
And then with hashbytes():
select hashbytes('MD5', cast(UTF16 as varbinary)) as [UTF16-Hash] from #Test;
select hashbytes('MD5', cast(UTF8 as varbinary)) as [UTF8-Hash] from #Test;
Which returns:
UTF16-Hash
0x0381CA5081FBC68B2F55F2F2C21399D7
UTF8-Hash
0x78903C575B0DDA53C4A7644A2DD36D0E
Hope this helps!