Is this sentence from an OCP book correct?
A HashSet stores its elements in a hash table, which means the keys are a hash and the values are an Object.
I created the following code snippet to check:
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Set;

class A
{
    private String s;

    public A(String s)
    {
        this.s = s;
    }

    public static void main(String[] args) throws NoSuchFieldException, IllegalAccessException {
        HashSet<A> elems = new HashSet<>(Set.of(new A("abc"), new A("cba"), new A("a")));
        // HashSet is backed by a private HashMap field named "map".
        // On newer JDKs this deep reflection may require
        // --add-opens java.base/java.util=ALL-UNNAMED.
        Field field = elems.getClass().getDeclaredField("map");
        field.setAccessible(true);
        HashMap<?, ?> privateMap = (HashMap<?, ?>) field.get(elems);
        System.out.println(privateMap);
    }
}
When I inspect the retrieved HashMap, it looks like the object itself is the key in the HashMap, rather than its hash. Is there an error in the book, or am I looking at something irrelevant?
Of course it is the object and not the hash. Some facts which are individually obvious or at least easy to verify:

- Hash codes are ints, i.e. 32-bit values, so there are vastly more possible objects than possible hashes; by the pigeonhole principle, two objects that are not equal can still share a hash code (for example, "Aa".hashCode() and "BB".hashCode() are both 2112).
- Because of such collisions, a hash alone cannot identify an element; the map has to keep the actual key object around and call .equals() to verify that things are actually equal.

The way hashmap/set works is as follows: the key's hashCode() picks a bucket in an internal table, every entry whose hash lands there is stored in that bucket, and a lookup walks just that one bucket, comparing each candidate to the requested key with equals() until it finds a match.
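A minimal sketch of that lookup, assuming a plain chained hash table (the class and field names here are invented for illustration; the real java.util.HashMap is more elaborate, with resizing, bit-mixing and tree bins):

final class HashLookupSketch
{
    // One node per (key, value) pair; a bucket is a chain of these.
    static final class Entry<K, V>
    {
        final K key;
        final V value;
        final Entry<K, V> next;

        Entry(K key, V value, Entry<K, V> next)
        {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    static <K, V> V get(Entry<K, V>[] table, K key)
    {
        int bucket = (key.hashCode() & 0x7fffffff) % table.length; // the hash only picks the bucket...
        for (Entry<K, V> e = table[bucket]; e != null; e = e.next) {
            if (e.key.equals(key)) {                               // ...equals() identifies the element
                return e.value;
            }
        }
        return null; // nothing in this bucket is equal to the key
    }
}

Note that the hash is only ever used to narrow the search down to one bucket; the element itself is what is stored and compared.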
And thus we can conclude:

- For any two objects a and b, if a.equals(b) is true, then a.hashCode() == b.hashCode() HAS to be true, or you can't use these objects in a hashset/hashmap: bizarre things will happen if you do, with keys/objects you just added disappearing into thin air but still showing up when iterating through, and so on (demonstrated in the snippet after this list).
- If a.hashCode() == b.hashCode() is true, then a.equals(b) doesn't have to be. It could just be a collision, which is fine.
- Even @Override public int hashCode() { return 1; } is valid. Indeed it is; HashMap and HashSet will operate as described, as long as the equals method is working fine. However, such maps/sets will be O(n) - their performance is linear in how many items are in there. Which is unexpected; the whole point of hashset/map is that their performance doesn't meaningfully change as they grow. HashMap/Set aren't magic - they need a good hashCode() impl to fulfill their performance promises.
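Here is a minimal demonstration of the first point, using a made-up class that overrides equals but not hashCode:

import java.util.HashSet;
import java.util.Set;

class BrokenKey
{
    private final String s;

    BrokenKey(String s)
    {
        this.s = s;
    }

    // equals is overridden, but hashCode is NOT - the contract is violated.
    @Override
    public boolean equals(Object o)
    {
        return o instanceof BrokenKey && ((BrokenKey) o).s.equals(this.s);
    }

    public static void main(String[] args)
    {
        Set<BrokenKey> set = new HashSet<>();
        set.add(new BrokenKey("abc"));
        // The two instances are equal(), but almost certainly have different
        // identity hash codes, so the lookup searches the wrong bucket:
        System.out.println(set.contains(new BrokenKey("abc"))); // false (almost always)
        System.out.println(set.size());                         // 1 - the element is still in there
    }
}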
If you read some docs or books that say the hash code is the key, then that's either from a section that is oversimplifying (accent on the over - that's probably not a good thing to put in a tutorial or explanation), or the author misunderstands how this stuff works, or it's a description of a hashmap impl in another system/language which takes the highly dubious shortcut of equating 'equal hashes' with 'therefore equal'. Given the pigeonhole principle and all that, this is a bad idea. If you MUST do things that way, use a good hash algo and way more bits than 32!