Assume we have the following trivial class:
public class Foo {
    public Integer bar;
}
And we want to build a "good" hashCode method for it. By "good" I mean, for instance, that there is a small probability of hash code collisions in "real-life" cases.
In "real life", for such a class I would reasonably expect Foo instances with bar set to null or 0. I'd even argue that these two are probably the most frequent values.
But let's take a look at what Eclipse, for instance, generates:
public class Foo {
    public Integer bar;

    @Override
    public int hashCode() {
        final int prime = 31;
        int result = 1;
        result = prime * result + ((bar == null) ? 0 : bar.hashCode());
        return result;
    }
}
And it's not just Eclipse; using 0 as the hashCode for null seems to be normal practice.
But this would produce identical hash codes for null and 0, wouldn't it? And since I'm assuming that null and 0 are probably the most frequent cases, this leads to a higher probability of collision.
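To make the collision concrete, here is a minimal sketch that reuses the Eclipse-generated hashCode from above. The CollisionDemo class is only a hypothetical harness for printing the two values.

```java
class Foo {
    public Integer bar;

    // Same body as the Eclipse-generated method shown above.
    @Override
    public int hashCode() {
        final int prime = 31;
        int result = 1;
        result = prime * result + ((bar == null) ? 0 : bar.hashCode());
        return result;
    }
}

// Hypothetical harness, not part of the original class.
class CollisionDemo {
    public static void main(String[] args) {
        Foo withNull = new Foo();   // bar defaults to null
        Foo withZero = new Foo();
        withZero.bar = 0;           // Integer.valueOf(0).hashCode() is 0

        // Both print 31, because null and 0 each contribute 0 to the sum.
        System.out.println(withNull.hashCode());
        System.out.println(withZero.hashCode());
    }
}
```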
So here comes my question: what would be a good hashCode value for null?
From Joshua Bloch's excellent book Effective Java, 2nd Edition (page 49):
If the value of the field is null, return 0 (or some other constant, but 0 is traditional).
So you can use any constant of your choice, but usually 0 is used as the hash code of null.
In your case, where 0 appears frequently, it might indeed be better to choose a constant other than 0 (one that doesn't appear as a valid value in your field) to avoid collisions.
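As a rough sketch of that idea, the variant below replaces the 0 used for null with a different constant. The class name FooWithNullConstant and the constant NULL_HASH are hypothetical, and the choice of Integer.MIN_VALUE assumes that bar never holds that value in practice. Since Integer.hashCode() returns the int value itself, every possible constant coincides with the hash of some Integer, so this only moves the collision to a value you don't expect to see.

```java
import java.util.Objects;

class FooWithNullConstant {
    // Hypothetical choice: assumes bar is never Integer.MIN_VALUE in real data.
    private static final int NULL_HASH = Integer.MIN_VALUE;

    public Integer bar;

    @Override
    public int hashCode() {
        final int prime = 31;
        int result = 1;
        // Use NULL_HASH instead of the traditional 0 for a null field.
        result = prime * result + ((bar == null) ? NULL_HASH : bar.hashCode());
        return result;
    }

    // equals is included so the equals/hashCode contract still holds.
    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (!(obj instanceof FooWithNullConstant)) {
            return false;
        }
        return Objects.equals(bar, ((FooWithNullConstant) obj).bar);
    }
}
```

With this variant, an instance whose bar is null and one whose bar is 0 no longer share a hash code, while two instances with equal bar values still do.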