Search code examples
c#iequatable

IEquatable<T>.Equals from MSDN


I'm looking at IEquatable.Equals on MSDN. Specifically the section for the two Equals operators:

public override bool Equals(Object obj)
{
   if (obj == null)
      return false;

   Person personObj = obj as Person;
   if (personObj == null)
      return false;
   else
      return Equals(personObj);
}

public static bool operator == (Person person1, Person person2)
{
   if (((object)person1) == null || ((object)person2) == null)
      return Object.Equals(person1, person2);

   return person1.Equals(person2);
}

The line I'm struggling with is:

if (((object)person1) == null || ((object)person2) == null)
   return Object.Equals(person1, person2);
  1. Why the cast to object before checking if it's null? Is there more to it or could it just as easily been expressed if ( person1 == null || person2 == null )?
  2. Why the call to Object.Equals? Surely if one of the items is null then it's false?

The way I see it,

if (((object)person1) == null || ((object)person2) == null)
   return Object.Equals(person1, person2);

is just a convoluted way of writing:

if ( person1 == null || person2 == null )
   return false;

or am I missing something?


Solution

  • Why the cast to object before checking if it's null?

    You're in an overload of == so you'd call back into that overload if you didn't cast, and there'd be infinite recursion with a stack-overflow (or worse, if it managed to be tail-call optimised, an infinite loop). You need to make sure you call object's == not your own.

    Why the call to Object.Equals? Surely if one of the items is null then it's false?

    Not if they're both null, then it's true.

    As such it makes sense. I don't recommend it though. It's simpler just to do that null check in its entirety itself:

    if ((object)person1 == null)
      return (object)person2 == null;
    if ((object)person2 == null)
      return false; // we already know person1 isn't null
    // Follow with rest of logic.
    

    There are a few different patterns we can do here. We could just as well do:

    if ((object)person1 == null && (object)person2 == null) return true;
    if ((object)person1 == null || (object)person2 == null) return false;
    // Follow with rest of logic.
    

    So far, much of a muchness with just one extra comparison. Still checking if a reference is the same as another (including checking if it's null) is cheap. Let's get rid of branches so:

    if ((object)person1 == null & (object)person2 == null) return true;
    if ((object)person1 == null | (object)person2 == null) return false;
    // Follow with rest of logic.
    

    The extra cost of the potentially unnecessary check in each line is likely less than the cost of the branch on whether or not to do it, so this is a win.

    But now consider that the first line is checking if they're both null. Really that's just a subset of the cases of them both being the same instance. Let's just check for that instead:

    if ((object)person1 == (object)person2) return true;
    if ((object)person1 == null | (object)person2 == null) return false;
    // Follow with rest of logic.
    

    Now as well as handling the case where they are both null, I handle the case where they are both the same object. Since its the same reference-identity check either way this pretty much adds no cost to the method, but if we have to check lots of things to be sure two items are equal (consider checking two very large strings and only knowing they're the same after checking each character or collation unit) it gives us a fast true on what could have been a very slow true.

    Now let's consider that Equals() is where we'll have most of this logic. If we apply the above we've a choice between:

    public static bool operator == (Person person1, Person person2)
    {
      if ((object)person1 == (object)person2)
        return true;
      if ((object)person1 == null | (object)person2 == null)
        return false;
      return person1.Equals(person2);
    }
    

    And

    public static bool operator == (Person person1, Person person2)
    {
      if ((object)person1 == (object)person2)
        return true;
      return ((object)person1 != null  && person1.Equals(person2);
    }
    

    The latter depends upon the fact that person1.Equals(person2) is going to check that person2 isn't null anyway. The former would (because it avoids a branch) likely be a slight win when person2 is null, but the latter would likely be a slight win otherwise, and is a bit more concise. I'd generally go for the latter.

    As such, the use of object.Equals() in the example you quote is valid, but it's not the approach I'd recommend.


    Incidentally the override they suggest for object.Equals() I would not recommend at all:

    public override bool Equals(Object obj)
    {
      if (obj == null)
        return false;
    
      Person personObj = obj as Person;
      if (personObj == null)
        return false;
      else
        return Equals(personObj);
    }
    

    If you cut out the first null check then the second one would still catch that case.

    If you cut out the second null check then the call to Equals() (the overloaded one that takes a Person) will catch it.

    As such it should just be:

    public override bool Equals(object obj)
    {
      return Equals(obj as Person);
    }
    

    That pattern will serve as the Equals(object) override for any class implementing IEquatable<T> (there might be cases where you wanted to consider the object equal to one of a different type, but those are rare and often wrong even when they are done, so should be perhaps considered a very specialised case). For any struct you can use:

    public override bool Equals(object obj)
    {
      return obj is TheStructType && Equals((TheStructType)obj);
    }