Tags: java, hash, search-engine, hashcode

Is Java's hashCode() deterministic?



I am trying to implement a document search engine that uses the minhashing algorithm, and I use hashCode to pre-hash words. Will the same word get the same hash every time I run the program?

Will it get the same hash even if I run the program on a different machine (32-bit vs 64-bit)?


Solution

  • It depends on the class you are referring to. The base Object.hashCode implementation is not deterministic since, as stated in the documentation:

    As much as is reasonably practical, the hashCode method defined by class Object does return distinct integers for distinct objects. (This is typically implemented by converting the internal address of the object into an integer, but this implementation technique is not required by the Java™ programming language.)

    Addresses are not deterministic; they are sometimes even used as a source of entropy. The resulting hash code can therefore differ from one run to the next.

    String, on the other hand, has a deterministic hash code, which is defined as follows:

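    h(s) = s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]

    where s[i] is the i-th character of the string and n is its length, computed in 32-bit int arithmetic (formula as given on Wikipedia).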

    In some cases there is not even a sensible deterministic definition for the hash code.
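
To make the difference concrete, here is a minimal sketch that recomputes the polynomial documented for String.hashCode by hand and compares it with the built-in result. The class name StringHashDemo and the helper polynomialHash are made up for this illustration and are not part of the JDK. Because the String formula depends only on the characters, it should give the same value on every run and on both 32-bit and 64-bit JVMs, whereas the value printed for a plain Object is allowed to change between runs.

```java
class StringHashDemo {
    // Hypothetical helper: evaluates s[0]*31^(n-1) + ... + s[n-1]
    // with Horner's rule, relying on wrapping 32-bit int arithmetic.
    public static int polynomialHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        String word = "minhash";

        // Depends only on the characters of the string.
        System.out.println(word.hashCode() == polynomialHash(word)); // prints true

        // Identity-based hash of a plain Object: may differ between runs.
        System.out.println(new Object().hashCode());
    }
}
```

For pre-hashing words in a minhash pipeline this means String.hashCode is safe to rely on, as long as the values being hashed really are Strings and not objects that inherit Object.hashCode.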