I'm using MurmurHash3 to create unique hashes for text entries. When a text entry is created, I use this PHP implementation, which returns a 32-bit integer hash, to get the hash value. The hash is stored in a BINARY(16) database column. I also need to backfill our existing rows, so I'm using this MySQL implementation to update the database. To match the PHP-created hash, I convert the result to base 32 and lowercase it.
UPDATE column SET hash=LOWER(CONV(murmur_hash_v3(CONCAT(column1, column2), 0), 10, 32));
It matches the PHP version about 80% of the time, which obviously isn't going to cut it. For example, hashing the string 'engtest' creates 15d15m in PHP and 3uqiuqa in MySQL. However, the string 'engtest sentence' creates the same hash in both. What could I be doing wrong?
Figured it out. PHP's integer type is signed, and MurmurHash occasionally produced negative hash values that didn't match the always-positive MySQL values. The solution was to format PHP's hash value with sprintf, using the "%u" format, before the base conversion.
$hash = murmurhash3_int($text);
return base_convert(sprintf("%u", $hash), 10, 32);
See the PHP crc32 docs for more info.
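For anyone who wants to check this against the 'engtest' example from the question, here is a minimal sketch. The MySQL value 3uqiuqa is 4255742794 in decimal; read as a signed 32-bit integer, that is -39224502, and the digits 39224502 convert to 15d15m, which is exactly the mismatch above. The helper name to_unsigned_base32 is made up for illustration. It adds 2^32 to negative values instead of relying on "%u", on the assumption that the hash may arrive as a negative integer on a 64-bit PHP build, where "%u" would yield a 64-bit unsigned value rather than a 32-bit one.

```php
<?php
// Hypothetical helper (not part of the original answer): reinterpret a
// possibly negative 32-bit hash as unsigned, then convert it to base 32.
function to_unsigned_base32(int $hash): string
{
    $value = $hash;
    if ($value < 0) {
        // Add 2^32 to get the unsigned 32-bit reading.
        // On 32-bit PHP this promotes $value to a float, which is fine here.
        $value += 4294967296;
    }
    // '%.0f' prints the value as plain decimal digits whether it is an int or a float.
    return base_convert(sprintf('%.0f', $value), 10, 32);
}

// Signed 32-bit reading of the 'engtest' hash from the question.
echo to_unsigned_base32(-39224502), "\n"; // 3uqiuqa
```

Non-negative hashes pass through unchanged, so strings that already matched (such as 'engtest sentence') keep the same stored value.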