java equals hashset hashcode

Why does HashSet sometimes not add an object when relying on the default hashCode and equals?


I am working with a ConcurrentHashMap<String, HashSet<MyClass>>, and sometimes adding a newly constructed MyClass to the set fails. The constructor takes 3 parameters; 2 of them are the same instance on every call, while the third differs in both instance and value. During batch runs of 500 test executions I've seen failure rates from 0.5% to 18% when using the default hashCode and equals methods provided by Java's Object. However, when I generate them myself or use Lombok's @EqualsAndHashCode to create them for me, I have never seen it fail in over 250k tests. I dug into what's happening underneath and could not find a solid answer as to why HashSet#add sometimes returns false, i.e. does not add the object to the set, even though each MyClass is created with new.

Some possible theories I've seen:

  • HashMap takes the hash modulo the size of the underlying table to decide where to place an entry. This could create a collision between different keys, but I don't see HashMap using a modulus operator anywhere.

  • HashMap combines hashCode with its own hashing function to determine the hash of a key. Since the default hashCode may not always return the same value, calling it again inside HashMap's hashing function could give unstable results. I'm more inclined to believe this one, since overriding hashCode provides a stable result.
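As a reference point for the two theories above, here is a minimal sketch of how a bucket index is derived, assuming the OpenJDK 8+ implementation of HashMap (other JDKs may differ). The class name BucketIndexSketch and the helpers spread and bucketIndex are hypothetical names used only for illustration. The sketch also shows that a bucket collision alone does not make HashSet#add return false: that only happens when equals reports an equal element.

```java
import java.util.HashSet;
import java.util.Set;

public class BucketIndexSketch {

    // Mirrors the bit-spreading step of java.util.HashMap.hash(Object) in
    // OpenJDK 8+ (assumption: non-null key; the real method maps null to 0).
    static int spread(Object key) {
        int h = key.hashCode();
        return h ^ (h >>> 16);
    }

    // The table length is always a power of two, so (length - 1) & hash
    // plays the role of a modulus without using the % operator.
    static int bucketIndex(Object key, int tableLength) {
        return (tableLength - 1) & spread(key);
    }

    public static void main(String[] args) {
        Object a = new Object();
        Object b = new Object();

        // The Object.hashCode contract requires a consistent value for the
        // same object during one execution of the application.
        System.out.println(a.hashCode() == a.hashCode()); // true

        Set<Object> set = new HashSet<>();
        System.out.println(set.add(a)); // true
        // Still true even if a and b happen to share a bucket, because the
        // default equals compares identity and a != b.
        System.out.println(set.add(b)); // true
        System.out.println(set.add(a)); // false: the same instance is already present

        System.out.println(bucketIndex("id1", 16)); // bucket for "id1" in a 16-slot table
    }
}
```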

However, I have not yet found a definitive answer: when using the default hashCode and equals from Object, why does HashSet#add sometimes return false for objects that contain different fields and are unique instances created with new?

public class MyClass extends MyHelperClass {
    private String myString;

    public MyClass(String id, String id2) {
        super(id2);
        this.myString = id;
    }
}
import java.util.Objects;

public class MyHelperClass {
    private String myString;

    MyHelperClass(String myString_){
        this.myString = myString_;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (o == null || getClass() != o.getClass()) {
            return false;
        }
        MyHelperClass that = (MyHelperClass) o;
        return Objects.equals(myString, that.myString);
    }

    @Override
    public int hashCode() {
        return Objects.hash(myString);
    }
}
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class SomeService {
    public final Map<String, Set<MyClass>> myMap = new ConcurrentHashMap<>();

    public void addThis(String id1, String id2) {
        myMap.computeIfAbsent(id1, a -> new HashSet<>());
        myMap.get(id1).add(new MyClass(id2, id1));
    }
}

Test I'm running:

    SomeService someService = new SomeService();

    void test() {
        someService.addThis("id1", "id2");
        someService.addThis("id1", "id3");
        assertThat(someService.myMap.get("id1").size()).isEqualTo(2);
    }

In typing this out I realize that the class I'm extending has Lombok-generated hashCode and equals, but the inheriting class does not have these methods. When I add equals and hashCode to the inheriting class, all is peachy. I'm still not 100% sure why the HashSet doesn't like it.


Solution

  • When called from your test code, your SomeService class ends up creating the following objects:

    new MyClass("id2", "id1");
    new MyClass("id3", "id1");
    

    Due to inheritance the following base constructor will be called for these two objects:

    new MyHelperClass("id1");
    new MyHelperClass("id1");
    

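    As you can see, the base constructor is called with the same argument both times. MyClass does not override equals() or hashCode(), so it inherits them from MyHelperClass, and those implementations only look at the base-class field. As a result, the two objects are considered equal and have the same hash code. The following example code shows the issue: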

    MyClass o1 = new MyClass("id2", "id1");
    MyClass o2 = new MyClass("id3", "id1");
    System.out.println(o1.hashCode());
    System.out.println(o2.hashCode());
    System.out.println(o1.equals(o2));
    

    This will generate the following output:

    104085
    104085
    true
    

    Because the two objects are equal, only one instance will be stored in your HashSet.
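One way to make the two objects distinct is sketched below, under the assumption that both the base-class field and the subclass field should take part in equality. The wrapper class EqualsFixSketch and the helper distinctCount are hypothetical names added only to keep the example self-contained; MyHelperClass and MyClass follow the code from the question.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public class EqualsFixSketch {

    static class MyHelperClass {
        private final String myString;

        MyHelperClass(String myString) {
            this.myString = myString;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) {
                return true;
            }
            if (o == null || getClass() != o.getClass()) {
                return false;
            }
            MyHelperClass that = (MyHelperClass) o;
            return Objects.equals(myString, that.myString);
        }

        @Override
        public int hashCode() {
            return Objects.hash(myString);
        }
    }

    static class MyClass extends MyHelperClass {
        private final String myString;

        MyClass(String id, String id2) {
            super(id2);
            this.myString = id;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) {
                return true;
            }
            // super.equals covers the null check, the class check and the base field.
            if (!super.equals(o)) {
                return false;
            }
            MyClass that = (MyClass) o;
            return Objects.equals(myString, that.myString);
        }

        @Override
        public int hashCode() {
            // Combine the base-class hash with the subclass field.
            return Objects.hash(super.hashCode(), myString);
        }
    }

    // Adds the two objects from the test in the question and returns the set size.
    static int distinctCount() {
        Set<MyClass> set = new HashSet<>();
        set.add(new MyClass("id2", "id1"));
        set.add(new MyClass("id3", "id1"));
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctCount()); // 2
    }
}
```

With Lombok, annotating the subclass with @EqualsAndHashCode(callSuper = true) should generate roughly equivalent methods.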