Membership test on a list of arrays raises ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

import numpy as np

Q = [np.array([0, 1]), np.array([1, 2]), np.array([2, 3]), np.array([3, 4])]
for q in Q:
    print(q in Q)

Running the code above prints True on the first iteration, but raises a ValueError on the second:

True

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

I have no idea why it goes wrong on the second iteration. Can anybody help?

Solution

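Essentially, you can't use the in operator to test whether a numpy array is in a Python list. In your loop it only works for the first element, because of an optimisation in the way Python tests for membership: the identity check succeeds before any == comparison between arrays is needed.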

What's happening is that the implementation of list.__contains__ (which the in operator defers to) takes a shortcut to find a match faster: it checks for identity first. Most people using Python know this as the is operator. An identity check is faster than an == equality check, because all it has to do is see whether both references point to the same object, so it is worth trying first. An identity test works the same for any Python object, including numpy arrays.

The implementation essentially would look like this if it was written in Python:

def __contains__(self, needle):
    for elem in self:
        if needle is elem or needle == elem:
            return True
    return False
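
As a rough check of that description, the same logic can be written as a standalone function and run against the list from the question. This is only a sketch: contains_like_list is a made-up name, and the real list.__contains__ is implemented in C.

```python
import numpy as np

def contains_like_list(seq, needle):
    # Hypothetical pure-Python stand-in for list.__contains__:
    # identity is tried first, equality only if identity fails.
    for elem in seq:
        if needle is elem or needle == elem:
            return True
    return False

Q = [np.array([0, 1]), np.array([1, 2]), np.array([2, 3]), np.array([3, 4])]

# Q[0] is found by identity, so == is never evaluated.
first_result = contains_like_list(Q, Q[0])

# Q[1] is not identical to Q[0], so == runs and the 'if' has to
# turn the resulting array into a single boolean, which fails.
try:
    contains_like_list(Q, Q[1])
    raised = False
except ValueError:
    raised = True

print(first_result)  # True
print(raised)        # True
```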

What happens for your list of numpy arrays then is this:

for q in Q, step 1: q = Q[0]
- q in Q is then the same as Q.__contains__(Q[0])
  - Q[0] is self[0] => True!
for q in Q, step 2: q = Q[1]
- q in Q is then the same as Q.__contains__(Q[1])
  - Q[1] is self[0] => False :-(
  - Q[1] == self[0] => array([False, False]), because numpy arrays compare element by element (using broadcasting) and return an array of results rather than a single boolean.

The array([False, False]) result is not a boolean, but the if statement wants a boolean result, so the array is passed to (the C equivalent of) the bool() function. bool(array([False, False])) produces the error you see.

Or, done manually:

>>> import numpy as np
>>> Q = [np.array([0, 1]), np.array([1, 2]), np.array([2, 3]), np.array([3, 4])]
>>> Q[0] is Q[0]
True
>>> Q[1] is Q[0]
False
>>> Q[1] == Q[0]
array([False, False])
>>> bool(Q[1] == Q[0])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
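
The a.any() and a.all() methods named in the error message are how you tell numpy which single boolean you actually want from such a comparison. A small sketch, with example arrays a and c that are not from the question:

```python
import numpy as np

a = np.array([1, 2])
c = np.array([1, 5])

# Element-wise comparison: one boolean per position.
comparison = a == c

# .all() asks "do all positions match?"
all_match = bool(comparison.all())
# .any() asks "does at least one position match?"
any_match = bool(comparison.any())

print(all_match)  # False
print(any_match)  # True

# np.array_equal compares shapes as well as values, so arrays of
# different shapes are reported as unequal.
different_shape = bool(np.array_equal(a, np.array([1, 2, 3])))
print(different_shape)  # False
```

For "are these two arrays the same?", (a == c).all() is the relevant reduction, and np.array_equal() is the more convenient spelling because it also copes with arrays of different shapes.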

You'll have to use any() and numpy.array_equal() to create a version of list.__contains__ that doesn't use (normal) == equality checks:

def list_contains_array(lst, arr):
    return any(np.array_equal(arr, elem) for elem in lst)

and you can then use that to get True for your loop:

>>> for q in Q:
...     print(list_contains_array(Q, q))
...
True
True
True
True
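
Because the helper compares by value rather than by identity, it should also find an array that has the same contents as a list element but is a separate object. A short sketch, where the fresh and missing arrays are made up for illustration:

```python
import numpy as np

def list_contains_array(lst, arr):
    return any(np.array_equal(arr, elem) for elem in lst)

Q = [np.array([0, 1]), np.array([1, 2]), np.array([2, 3]), np.array([3, 4])]

fresh = np.array([2, 3])    # same values as Q[2], but a different object
missing = np.array([9, 9])  # values that appear nowhere in Q

found_fresh = list_contains_array(Q, fresh)
found_missing = list_contains_array(Q, missing)

# An identity-only search, for contrast: 'fresh' is not one of the
# objects stored in Q, so this finds nothing.
found_by_identity = any(elem is fresh for elem in Q)

print(found_fresh)        # True
print(found_missing)      # False
print(found_by_identity)  # False
```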