Search code examples
pythonoopcomparison-operatorsobject-identity

Purpose of overloading identity operator


Why isn't it possible in Python to overload the identity comparison operator? Every other comparison operator is possible to customize, so why not identity comparison?


Solution

  • Programming languages that support objects with mutable state typically provide an operator that can test whether two objects are, in fact, the same object. "Same," in this case, means that the objects are actually the same object (e.g., the same chunk of bytes in memory (or however the compiler designers choose to represent objects). For many types of data structures, though, there are other types of equivalence relations that might be more salient for the programmer. For instance, given a List interface, the programmer might only care about whether two lists contain equivalent elements in the same order. This only really matters in the case that there is some way in which two Lists with equivalent elements can be distinguished. Because many programming languages support mutable state, operations that mutate the state of an object are just such a way in which such objects could be distinguished.

    For instance, given an mutable implementation of lists we might have:

    x = make a list of 1 2 3
    y = x
    z = make a list of 1 2 3 4
    
    x same as y?  yes.
    x equal to y? yes.
    x same as z?   no.
    x equal to z?  no.
    
    add 4 to end of x
    
    x same as y?  yes.
    x equal to y? yes.
    x same as z?   no.
    x equal to z? yes. ##
    

    In a functional programming language that doesn't have mutable state, or even in a language that does have mutable state, but in which we use functional styles, we wouldn't destructively modify a list like this, but rather the add operation would return a new list (probably sharing structure with the others). In such a case, it would be possible to have only one list for any sequence of elements, so we could have:

    x = make a list of 1 2 3
    y = x
    z = make a list of 1 2 3 4
    
    x same as y?  yes.
    x equal to y? yes.
    x same as z?   no.
    x equal to z?  no.
    
    x' = add 4 to end of x
    
    x same as y?  yes.
    x equal to y? yes.
    x same as z?   no.
    x equal to z?  no.
    x same as x'?  no.
    x equal to x'? no.
    
    x' same as x?    no.
    x' equal to x?   no.
    x' same as z?   yes. ## or no, depending on implementation
    x' equal to z?  yes.
    x' same as x'?  yes.
    x' equal to x'? yes.
    

    The fact that

    x same as y?  yes.
    x equal to y? yes.
    x same as z?   no.
    x equal to z?  no.
    

    stays the same throughout can be useful in reasoning about the behavior of programs.

    When we're programming in a object-oriented fashion, object identity is an important concept, and is really one of the primitives of the language, just like boolean operators, or numeric comparison. If it can be overridden, a whole class of optimizations cannot be performed, and you could introduce some pretty hard to track down bugs. For instance, consider the (perhaps contrived example):

    # frob x and y, but never frob an object twice
    frobBoth x y
      if x same as y         # **
        frob x
      else
        frob x
        frob y
      end if
    

    If you could override same as, then you might not frob both x and y, because same as could return true even though x and y aren't the same object.

    In languages where object identity can be important, there needs to be a object identity operator that cannot be overridden. It's also typically useful to introduce an equality operator that can be overridden in some way so that it's easy to check whether two objects are equivalent in some useful way (and this will be specific to the type of object).

    • In Python, the identity operator is is and the equality operator is ==, which can be customized by the __eq__ method.
    • As another example, in Java the identity operator is ==, and the equality operator is Object.equals(Object).

    It's worth noting that in many languages, the equality operator's default implementation is object identity. This can be nice because object identity is typically much quicker to test than other, more complex, equality relations.