
In what order should Python’s list.__contains__ invoke __eq__?


Consider the following Python program:

class Foo(object):

    def __init__(self, bar):
        self.bar = bar

    def __repr__(self):
        return 'Foo(%r)' % (self.bar,)

    def __eq__(self, other):
        print('Foo.__eq__(%r, %r)' % (self, other))
        return self.bar == other

foo1 = Foo('A')
foo2 = Foo('B')
assert foo1 not in [foo2]

Under CPython 2.7.11 and 3.5.1, it prints:

Foo.__eq__(Foo('A'), Foo('B'))
Foo.__eq__(Foo('B'), 'A')

But under PyPy 5.3.1 (2.7), it prints:

Foo.__eq__(Foo('B'), Foo('A'))
Foo.__eq__(Foo('A'), 'B')

Although Python 3.5’s documentation states that equality should be symmetric “if possible”, sometimes it is not. In that case, the order of arguments to Foo.__eq__ becomes important.
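
To make the asymmetry concrete, here is a minimal sketch using two hypothetical classes, `Always` and `Never` (neither appears in the program above). It only illustrates that swapping the operands of `==` can change the result when `__eq__` is not symmetric:

```python
class Always(object):
    """Hypothetical type that claims to be equal to everything."""

    def __eq__(self, other):
        return True


class Never(object):
    """Hypothetical type that claims to be equal to nothing."""

    def __eq__(self, other):
        return False


# The left operand's __eq__ is tried first, so it decides the answer here.
left_first = Always() == Never()
right_first = Never() == Always()
print('%r %r' % (left_first, right_first))
```

With types like these, whether a container membership test succeeds depends on which operand the container places on the left.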

So, is the above CPython behavior an implementation detail, or is it a part of list’s public interface (meaning that PyPy has a bug)? Please explain why you think so.


Solution

  • Per the language reference:

    For container types such as list, tuple, set, frozenset, dict, or collections.deque, the expression x in y is equivalent to any(x is e or x == e for e in y).

    The other examples in the same section use the same operand order for the equality test, with the item being searched for on the left. This suggests that the comparison should be item_maybe_in_list.__eq__(item_actually_in_list), in which case PyPy's behavior could be considered a bug. Additionally, CPython is the reference implementation, so where the two disagree, CPython's behavior is generally treated as authoritative.

    That said, it is worth raising this with the PyPy community to see how they view it.
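
As a rough check, the expression quoted from the language reference can be written out literally and instrumented. This is only a sketch: the `calls` list and the helper `contains_as_documented` are names made up for illustration, and `Foo` is the class from the question with the `print` replaced by a recorder.

```python
calls = []  # (left, right) operand pairs seen by Foo.__eq__, in call order


class Foo(object):

    def __init__(self, bar):
        self.bar = bar

    def __repr__(self):
        return 'Foo(%r)' % (self.bar,)

    def __eq__(self, other):
        calls.append((repr(self), repr(other)))
        return self.bar == other


def contains_as_documented(x, y):
    # The documented equivalence, spelled out as ordinary Python.
    return any(x is e or x == e for e in y)


foo1 = Foo('A')
foo2 = Foo('B')
found = contains_as_documented(foo1, [foo2])
print(found)
print(calls)
```

Written this way, the searched-for item is the left operand of `==`, so `Foo('A').__eq__` is called first. Running `foo1 in [foo2]` on a given interpreter and comparing the recorded calls shows whether its built-in `list.__contains__` follows the same order.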