Per the Java documentation, the hash code for a String
object is computed as:
s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
using
int
arithmetic, wheres[i]
is the ith character of the string,n
is the length of the string, and^
indicates exponentiation.
Why is 31 used as a multiplier?
I understand that the multiplier should be a relatively large prime number. So why not 29, or 37, or even 97?
According to Joshua Bloch's Effective Java, Second Edition (a book that can't be recommended enough, and which I bought thanks to continual mentions on Stack Overflow):
The value 31 was chosen because it is an odd prime. If it were even and the multiplication overflowed, information would be lost, as multiplication by 2 is equivalent to shifting. The advantage of using a prime is less clear, but it is traditional. A nice property of 31 is that the multiplication can be replaced by a shift and a subtraction for better performance:
31 * i == (i << 5) - i
. Modern VMs do this sort of optimization automatically.
(from Chapter 3, Item 9: Always override hashCode
when you override equals
, page 48)