Hash Functions are incredibly useful and versatile. In general, they are used to map a space to one much smaller space. Of course that means that two objects may hash to the same value (collision), but this is because you are reducing the space (pigeonhole principle). The efficiency of the function largely depends on the size of the hash space.
It comes as a surprise then that a lot of Java hashCode functions are using multiplication to produce the hash code of a new object as e.g. follows (creating-a-hashcode-method-java)
@Override
public int hashCode() {
final int prime = 31;
int result = 1;
result = prime * result + ((email == null) ? 0 : email.hashCode());
result = prime * result + (int) (id ^ (id >>> 32));
result = prime * result + ((name == null) ? 0 : name.hashCode());
return result;
}
If we want to mix two hashcodes in the same range, xor should be much better than addition and is I think traditionally used. If we wanted to increase the space, shifting by some bytes and then xoring would still imho make sense. I guess multiplying by 31 is almost the same as shifting one hash by 1 and then adding but it should be much less efficient...
As it is the recommended approach though, I think I am missing something. So my question is why would this be?
Notes:
int
type in Java can be used to represent any whole number from -2147483648 to 2147483647.The answer to this is a mixture of different factors: