java string hashcode backwards-compatibility

Why did Sun specify String.hashCode() implementation?


There seems to be an ongoing debate about whether it is safe to rely on the current implementation of String.hashCode(), given that, technically speaking, it is guaranteed by the specification (Javadoc).

  1. Why did Sun specify String.hashCode()'s implementation in the specification?
  2. Why would developers ever need to rely upon a specific implementation of hashCode()?
  3. Why is Sun so afraid that the sky will fall if String.hashCode() is changed in the future? (This is probably explained by #2.)

Solution

  • One reason to rely on the specific implementation of hashCode() is that hash values may be persisted to a database, file or any other storage medium. Bad Things(tm) would happen if that data were read back in after the hashing algorithm had changed. You could encounter unexpected hash collisions and, more worryingly, be unable to find something by its hash because the hash had changed between the data being persisted and "now".

    In fact, that pretty much explains point #3 too =)

    The reason for point #1 could be "to allow interoperability". If the hashCode() implementation is locked down, then data can be shared between different implementations of Java quite safely, i.e., the hash of a given string will always be the same irrespective of implementation.
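
As a rough illustration of the points above, the sketch below recomputes the formula documented in the String.hashCode() Javadoc (s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1], in int arithmetic) and then looks up a value by a hash that was supposedly stored earlier. The class name SpecifiedStringHash, the helper specifiedHash and the persistedIndex map are hypothetical names invented for this example; the map merely stands in for a database or file written by an earlier run, possibly on a different JVM.

```java
import java.util.HashMap;
import java.util.Map;

public class SpecifiedStringHash {

    // Formula documented in the String.hashCode() Javadoc:
    // s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1], using int arithmetic.
    static int specifiedHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i); // Horner's rule; int overflow simply wraps
        }
        return h;
    }

    public static void main(String[] args) {
        String key = "hello";

        // The hand-written formula agrees with the library method.
        System.out.println(specifiedHash(key) == key.hashCode()); // prints "true"
        System.out.println(key.hashCode());                       // prints "99162322"

        // Hypothetical stand-in for an index persisted by an earlier run:
        // the hash of "hello" was stored as the lookup key.
        Map<Integer, String> persistedIndex = new HashMap<>();
        persistedIndex.put(99162322, "record for hello");

        // The lookup succeeds only because the hash is the same today as
        // it was when the index was written.
        System.out.println(persistedIndex.get(key.hashCode()));   // prints "record for hello"
    }
}
```

If the algorithm ever changed, the last lookup would return null even though the record is still there, which is the failure described in the first point.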