
Do we need to generate equals and hashCode in a subclass every time, even if the superclass already has them and is serializable, in Java?


I have a class called User.java and the code is as follows,

public abstract class User implements Serializable {

   // serialVersionUID

   private Integer userId;
   private String userName;
   private String fullName;

   // Constructor()
   // Getters and Setters
   // equals()
   // hashCode()
}

And I have a Contact.java class,

public class Contact extends User {

   // serialVersionUID

   private String phoneNumber;
   private String address;

   // Constructor()
   // Getters and Setters
}

So my question is: even though the User class already has generated equals and hashCode methods, do I need to override them again in the subclass Contact?

I am also using Lombok, and my IDE is IntelliJ. When I generate equals and hashCode through the IDE, there are templates to select from, for example,

  • Default
  • Apache Commons Lang 3, and so on.

The generated hashCode differs depending on the source, for example,

  • Lombok generates different code
  • Apache Commons Lang 3 generates different code

So what is the difference between each of those generated hashCode() implementations?

What can I try to solve this?


Solution

  • Normally, yes, but first you need to think about what equality even means in the context of a User object. For example, I bet you'd consider any two User objects with the same ID to be equal, with no need or point in checking the userName or fullName fields, or the phoneNumber field. Possibly you want to check only userName, or only the combination of userId and userName. It's not about the code you want to write; it is about what you consider the defining equality relationship between any two given user objects.

    The answer is complicated.

    There is no easy answer to this question. Therefore, let's first talk about why you'd even want to add such methods.

    What is equals/hashCode for

    The point of these two methods is to give you the ability to use instances of your User class (or your Contact class) as keys in a java.util.Map of some sort, or to put them in some java.util.List and have these types actually fulfill what their documentation tells you. If you add objects of a type without equals/hashCode impls to, say, an ArrayList and then call .contains() on this, it won't do what you expected it to. If you try to use them as keys in a map, it won't work right.
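
    As a minimal sketch of the .contains() problem described above: PlainUser is a hypothetical stand-in class (not the asker's User) that deliberately has no equals/hashCode overrides, so the list falls back to reference comparison.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class for illustration: no equals()/hashCode() overrides.
class PlainUser {
    final Integer userId;

    PlainUser(Integer userId) {
        this.userId = userId;
    }
}

class ContainsDemo {
    // Asks a list holding one PlainUser whether it "contains" a second,
    // separately constructed PlainUser carrying the same userId.
    static boolean containsLookalike() {
        List<PlainUser> users = new ArrayList<>();
        users.add(new PlainUser(1));
        return users.contains(new PlainUser(1));
    }

    public static void main(String[] args) {
        // Object.equals compares references, so this prints false.
        System.out.println(containsLookalike());
    }
}
```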

    Thus, if you aren't planning on putting User objects in lists or maps, then there is no need to fret about any of this stuff. Don't write the methods at all. Move on with life.

    Okay, but I do want to do that.

    That gets us to..

    Then think about what your type actually represents

    What does an instance of the Contact class actually represent?

    • It represents a row in a database (could even be a textfile or a non-SQL based engine); changes to the object will result in changes to the underlying DB, for example.
    • It represents a contact in a user's address book. Not 'address book' as in 'the data store that the contact app I am writing is using', but 'the actual address book'.

    Depending on your answer, the equals/hashCode will be wildly different.

    It represents a row in the DB

    For SQL-based DB engines, the DB engine has a clear and universally understood definition for equality. If you say that an instance of Contact represents a row in your DB, then it is logical that the equality definition in your java code must match that of SQL, and that definition is simply this: Equal Primary Key? Then they are equal.

    PKs can be anything and can be multiple columns, but the vast majority of DB designs use an auto-generated single numeric column for primary key, and given that you have a 'userID' as a field, this sounds like your design.

    That means the definition of equality you want is: same userId. But it gets worse: with e.g. Hibernate you can create an instance of User and have that object exist in Java memory before you ever save it to the DB. For auto-generated primary keys, such an object effectively has no primary key yet, and two objects without a primary key are not equal, even if they are utterly identical in every other way. So the full definition is: equal if the userId fields are equal, unless the userId holds the placeholder value indicating 'not saved to DB yet', in which case not equal. That is such a crazy definition of equality that no auto-generation tool implements it; you'd have to write it yourself.

    It represents an entry in the address book

    Then the definition is effectively flipped: that a contact entry has an id is an implementation detail of the database, and it is the only field in the entire class that is not an intrinsic part of the contact as a concept. Equality is thus probably best defined as: "Identical, except for the DB id, which does not matter as it is not an intrinsic property of a contact".

    Again you need to take some extra actions here; lombok, intellij, eclipse -- all these tools cannot know any of this, and will by default just assume you intend that 2 instances are equal only if every field is equal.
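
    A hand-written sketch of this second interpretation might look like the following. AddressBookEntry is a hypothetical, flattened stand-in for Contact (the field selection is an assumption); the point is only that userId is left out of both methods.

```java
import java.util.Objects;

// Hypothetical, simplified stand-in for the Contact class.
// Declared final so no subtype can redefine equality.
final class AddressBookEntry {
    private final Integer userId;      // DB detail, deliberately ignored below
    private final String fullName;
    private final String phoneNumber;

    AddressBookEntry(Integer userId, String fullName, String phoneNumber) {
        this.userId = userId;
        this.fullName = fullName;
        this.phoneNumber = phoneNumber;
    }

    @Override
    public boolean equals(Object other) {
        if (other == this) return true;
        if (!(other instanceof AddressBookEntry)) return false;
        AddressBookEntry that = (AddressBookEntry) other;
        // userId is intentionally not compared.
        return Objects.equals(fullName, that.fullName)
            && Objects.equals(phoneNumber, that.phoneNumber);
    }

    @Override
    public int hashCode() {
        // Must use the same fields as equals(), and no others.
        return Objects.hash(fullName, phoneNumber);
    }
}
```

    With this definition, two entries that differ only in userId are equal, while a different phoneNumber makes them unequal.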

    Um.. how do I choose?

    Well, go back to what equals/hashCode is for: to make instances of this stuff function as keys in maps, and for contains and such to give proper answers when these instances are stored in Java lists. So, what do you want to occur if you store 2 separate instances of Contact into a list, where somehow they both have the same ID, but a different username value? Depending on your answer, you know which of the two interpretations is correct.

    Why are these tools generating different impls of hashCode?

    They don't. Not really. The point of hashCode is very simple:

    If any two given objects have different hashcodes, they cannot possibly be equal.

    That's it. That's all it means. Two objects with equal hashcodes don't need to be equal (you'd have to invoke a.equals(b) to find out), but two objects with non-equal hashcodes are not equal, and there is no need to invoke a.equals(b) to find out. This is not enforced by java at all, but you're supposed to write it that way (ensure that if 2 objects have non-equal hashcode, that they cannot be equal). If you fail to do so, instances of your classes will do bizarre things when used as e.g. keys in hashmaps.
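
    Both halves of that rule can be seen in a small sketch. BrokenKey is a hypothetical class written to violate the rule on purpose (equal objects, different hash codes); the "Aa"/"BB" pair is a well-known String hash collision.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical class that deliberately breaks the rule:
// two instances can be equal yet always have different hash codes.
class BrokenKey {
    private static int counter = 0;

    final int id;
    private final int hash = counter++; // different for every instance

    BrokenKey(int id) {
        this.id = id;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof BrokenKey && ((BrokenKey) o).id == id;
    }

    @Override
    public int hashCode() {
        return hash;
    }
}

class HashContractDemo {
    // HashSet checks hash codes before calling equals(), so the lookup
    // misses even though equals() would have returned true.
    static boolean brokenLookup() {
        Set<BrokenKey> set = new HashSet<>();
        set.add(new BrokenKey(1));
        return set.contains(new BrokenKey(1));
    }

    // Equal hash codes do not imply equality: "Aa" and "BB" both hash to 2112.
    static boolean sameHashButUnequal() {
        return "Aa".hashCode() == "BB".hashCode() && !"Aa".equals("BB");
    }

    public static void main(String[] args) {
        System.out.println(brokenLookup());       // false
        System.out.println(sameHashButUnequal()); // true
    }
}
```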

    There are many ways to write an algorithm that leads to this effect. This explains why there are small differences. But they are all about equally effective: for all of these different tools, the generated code produces different hashcodes for known-different objects about as reliably, and the hashCode method runs about equally fast.
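
    To illustrate, here are two hand-written styles side by side. Neither is the exact code any of the tools emits; the method names and the constants 17 and 37 are assumptions chosen for the sketch. Both satisfy the same rule: equal inputs always give equal outputs.

```java
import java.util.Objects;

class HashStyles {
    // Style 1: delegate to java.util.Objects.hash.
    static int viaObjectsHash(Integer userId, String userName) {
        return Objects.hash(userId, userName);
    }

    // Style 2: hand-rolled multiply-and-add with arbitrary odd constants.
    static int viaMultiplyAdd(Integer userId, String userName) {
        int result = 17;
        result = 37 * result + (userId == null ? 0 : userId.hashCode());
        result = 37 * result + (userName == null ? 0 : userName.hashCode());
        return result;
    }
}
```

    The two styles return different numbers for the same fields, which is harmless: a hash code only ever gets compared against other hash codes produced by the same method.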

    Subtyping

    Subtyping is extremely complicated in regards to equality. That's due to the rules as documented in Object's equals and hashCode javadoc. This answer is very long already so I won't get into why, so you'll just have to do some web searching or take my word for it. However, asking tools to auto-gen equality/hashCode impls in the face of a type hierarchy is very tricky.
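
    One of the ways this goes wrong can be sketched briefly. BaseUser and PhoneContact are hypothetical classes (BaseUser is concrete here, unlike the abstract User, to keep the example short): the subtype redefines equals to also compare its own field, and symmetry breaks.

```java
// Hypothetical classes for illustration only.
class BaseUser {
    final Integer userId;

    BaseUser(Integer userId) {
        this.userId = userId;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof BaseUser && userId.equals(((BaseUser) o).userId);
    }

    @Override
    public int hashCode() {
        return userId.hashCode();
    }
}

class PhoneContact extends BaseUser {
    final String phoneNumber;

    PhoneContact(Integer userId, String phoneNumber) {
        super(userId);
        this.phoneNumber = phoneNumber;
    }

    // Redefines equality to also require an equal phoneNumber.
    @Override
    public boolean equals(Object o) {
        return o instanceof PhoneContact
            && super.equals(o)
            && phoneNumber.equals(((PhoneContact) o).phoneNumber);
    }
}

class SymmetryDemo {
    // BaseUser only looks at userId, so it accepts the subtype.
    static boolean baseEqualsSub() {
        return new BaseUser(1).equals(new PhoneContact(1, "555-0100"));
    }

    // PhoneContact insists on another PhoneContact, so it rejects the base.
    static boolean subEqualsBase() {
        return new PhoneContact(1, "555-0100").equals(new BaseUser(1));
    }
}
```

    Here baseEqualsSub() is true while subEqualsBase() is false, which violates the requirement that a.equals(b) and b.equals(a) agree.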

    It sure sounds like you want to lock down what equality (and therefore hashCode) means at the level of the User class: equality is defined by having the same userId (or perhaps the same userName, if that sounds more applicable to your situation), but no more than that. In that case you should write these methods yourself, and they should be:

    @Override
    public final boolean equals(Object other) {
        if (other == this) return true;
        // No id yet (not saved to the DB): equal only to itself.
        if (this.userId == null) return false;
        if (!(other instanceof User)) return false;
        return this.userId.equals(((User) other).userId);
    }

    @Override
    public final int hashCode() {
        return userId == null ? System.identityHashCode(this) : 61 * userId.intValue();
    }
    

    Why?

    • These define equality by way of 'both have a userID, and they are equal'.
    • They are consistent with the rules (such as: any 2 equal objects will necessarily have equal hashcodes).
    • They are simple.
    • They are 'final', because otherwise this gets incredibly complicated. You can't allow subtypes to redefine equality on you; that can't work, due to the rule that a.equals(b) must match b.equals(a).
    • Why 61? It's just an arbitrarily chosen prime number. Pretty much any number would do; an arbitrary prime is very very very slightly more efficient in exotic cases.
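
    As a usage sketch of the approach above: SketchUser and SketchContact are hypothetical, trimmed-down copies of User and Contact, and treating a null userId as 'not saved yet' is an assumption carried over from the DB-row interpretation.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical, trimmed-down version of the abstract User class.
abstract class SketchUser {
    private final Integer userId; // null is assumed to mean "not saved yet"

    SketchUser(Integer userId) {
        this.userId = userId;
    }

    @Override
    public final boolean equals(Object other) {
        if (other == this) return true;
        if (this.userId == null) return false;
        if (!(other instanceof SketchUser)) return false;
        return this.userId.equals(((SketchUser) other).userId);
    }

    @Override
    public final int hashCode() {
        return userId == null ? System.identityHashCode(this) : 61 * userId.intValue();
    }
}

// The subclass adds a field but inherits the final equals/hashCode as-is.
class SketchContact extends SketchUser {
    final String phoneNumber;

    SketchContact(Integer userId, String phoneNumber) {
        super(userId);
        this.phoneNumber = phoneNumber;
    }
}

class FinalEqualsDemo {
    // Same id, different phone numbers: treated as one user.
    static int savedSetSize() {
        Set<SketchUser> set = new HashSet<>();
        set.add(new SketchContact(7, "555-0100"));
        set.add(new SketchContact(7, "555-0199"));
        return set.size();
    }

    // No id yet: never equal to another instance, even with identical fields.
    static int unsavedSetSize() {
        Set<SketchUser> set = new HashSet<>();
        set.add(new SketchContact(null, "555-0100"));
        set.add(new SketchContact(null, "555-0100"));
        return set.size();
    }
}
```

    savedSetSize() yields 1 and unsavedSetSize() yields 2, and Contact itself needs no equals or hashCode of its own.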

    Actually I want the other definition

    Then use lombok (disclaimer: I am a core contributor there, so this is me rating myself), because it has the best equals implementation, and you won't have to look at the code or maintain it. Mark your userId field with the @EqualsAndHashCode.Exclude annotation, and mark the Contact class with @EqualsAndHashCode(callSuper = true), and User with @EqualsAndHashCode (or something that includes that, like @Value) - the callSuper is needed to tell lombok that the parent class has a lombok-compatible equals implementation.