
Murmurhash 2 results on Python and Haskell


Haskell and Python don't seem to agree on MurmurHash2 results. Python, Java, and PHP all return the same result, but Haskell doesn't. Am I doing something wrong with MurmurHash2 in Haskell?

Here is my code for Haskell Murmurhash2:

import Data.Digest.Murmur32

main :: IO ()
main = do
    -- Hash the String "woohoo" with seed 1 and print it as a Word32
    print $ asWord32 $ hash32WithSeed 1 "woohoo"

And here is the code written in Python:

import murmur

if __name__ == "__main__":
    print murmur.string_hash("woohoo", 1)

Python returned 3650852671, while Haskell returned 3966683799.


Solution

  • The murmur-hash package (I am its author) does not promise to compute the same hashes as implementations in other languages. If you rely on hashes being compatible with other software that computes hashes, I suggest you create newtype wrappers that compute hashes the way you want them. For text in particular, you need to at least specify the encoding. In your case you could convert the text to an ASCII string using Data.ByteString.Char8.pack, but that still doesn't give you the same hash, since the ByteString instance is more of a placeholder.

    BTW, I'm not actively improving that package because MurmurHash2 has been superseded by MurmurHash3, but I keep accepting patches.
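
To illustrate the encoding point: a hash of this kind operates on bytes, so text has to be encoded explicitly before it is hashed. Below is a minimal Python 3 sketch of the commonly published 32-bit MurmurHash2 routine. The function name murmurhash2_32 is made up for this example, and the code has not been checked against the murmur Python module or the murmur-hash package, so whether it reproduces either of the numbers in the question is an assumption you would need to verify yourself.

```python
MASK32 = 0xFFFFFFFF  # keep every intermediate value within 32 bits


def murmurhash2_32(data: bytes, seed: int) -> int:
    """Sketch of the widely published 32-bit MurmurHash2 (hypothetical helper).

    Assumes 4-byte blocks are read little-endian, which is what the
    reference C code does on little-endian machines.
    """
    m = 0x5BD1E995
    r = 24
    length = len(data)
    h = (seed ^ length) & MASK32

    # Mix the input four bytes at a time.
    i = 0
    while length - i >= 4:
        k = int.from_bytes(data[i:i + 4], "little")
        k = (k * m) & MASK32
        k ^= k >> r
        k = (k * m) & MASK32
        h = (h * m) & MASK32
        h ^= k
        i += 4

    # Fold in the remaining 0 to 3 bytes.
    remaining = length - i
    if remaining == 3:
        h ^= data[i + 2] << 16
    if remaining >= 2:
        h ^= data[i + 1] << 8
    if remaining >= 1:
        h ^= data[i]
        h = (h * m) & MASK32

    # Final avalanche steps.
    h ^= h >> 13
    h = (h * m) & MASK32
    h ^= h >> 15
    return h


if __name__ == "__main__":
    text = "woohoo"
    # The same text yields different byte sequences under different encodings.
    for encoding in ("ascii", "utf-16-le"):
        print(encoding, murmurhash2_32(text.encode(encoding), 1))
```

Because "woohoo".encode("ascii") and "woohoo".encode("utf-16-le") produce different byte sequences, the routine sees different input in each case. Two implementations can only be expected to agree once both sides fix the encoding, the seed, and the byte order used for reading blocks.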