Tags: python, arrays, numpy, containment

Two similar array-in-array containment tests. One passes, the other raises a ValueError. Why?


Moar noob Python questions

I have a list of NumPy arrays and want to test whether each of two arrays is in it. Console log:

>>> theArray
[array([[[213, 742]]], dtype=int32), array([[[127, 740]],
       [[127, 741]],
       [[128, 742]],
       [[127, 741]]], dtype=int32)]

>>> pair[0]
array([[[213, 742]]], dtype=int32)

>>> pair[1]
array([[[124, 736]]], dtype=int32)

>>> pair[0] in theArray
True

>>> pair[1] in theArray
Traceback (most recent call last):
  File "...\pydevd_exec2.py", line 3, in Exec
    exec(exp, global_vars, local_vars)
  File "<input>", line 1, in <module>
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

pair[0] and pair[1] seem to have exactly the same characteristics according to the debugger (apart from their contents). So how are these two cases different? Why does the second one fail while the first does not?


Solution

  • Using the in operator at all here is a mistake.

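    theArray isn't an array; it's a list. The in operator for lists assumes that == is an equivalence relation on the elements, but == on NumPy arrays is not: it compares elementwise and returns an array of booleans rather than a single boolean. The list then has to convert that array to a truth value, which is ambiguous when the array has more than one element. Using in here is essentially meaningless.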

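    Making theArray an array wouldn't help, because the in operator on NumPy arrays doesn't test for a whole subarray either. As far as I know, x in a behaves like (a == x).any(), which is true whenever any single element matches after broadcasting.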
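To make that concrete, here is a small sketch of how in behaves on an actual NumPy array. The names grid and candidate are made up for illustration, and the claim that in is equivalent to an elementwise comparison followed by .any() reflects the NumPy behavior I am aware of, so treat it as an assumption to verify on your version.

```python
import numpy as np

# Hypothetical example data, not taken from the question.
grid = np.array([[1, 2],
                 [3, 4]])
candidate = [1, 4]   # note: this is NOT a row of grid

# `in` on an ndarray: true if ANY element matches after broadcasting.
loose_hit = candidate in grid

# The same test spelled out explicitly.
explicit_any = bool((grid == candidate).any())

# A stricter, row-wise test: every element of some row must match.
row_hit = bool((grid == candidate).all(axis=1).any())

print(loose_hit, explicit_any, row_hit)
```

The first two results agree with each other, while the row-wise test disagrees with them, which is why in on an array is rarely the check you actually want.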

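    pair[0] in theArray happens not to raise an exception because of an optimization lists perform. For in, lists try an is comparison before ==, and pair[0] happens to be the exact same object as the first element of theArray, so the list never gets around to evaluating == and being confused by its return value. pair[1] is not the same object as any element, so the list falls back to ==, gets back an array of two booleans, and raises the ValueError when it tries to treat that array as True or False.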
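The identity shortcut can be reproduced in isolation. This is a minimal sketch with made-up data shaped like the arrays in the question; try_in is a hypothetical helper that only exists to capture the exception instead of crashing.

```python
import numpy as np

first = np.array([[[213, 742]]], dtype=np.int32)
second = np.array([[[127, 740]], [[127, 741]]], dtype=np.int32)
the_list = [first, second]

same_object = first            # another name for the very same array object
equal_copy = first.copy()      # equal contents, but a different object
other = np.array([[[124, 736]]], dtype=np.int32)

def try_in(item, container):
    """Return `item in container`, or the error text if it raises ValueError."""
    try:
        return item in container
    except ValueError as exc:
        return f"ValueError: {exc}"

# The `is` check succeeds, so == is never evaluated.
found_same = same_object in the_list

print(found_same)
print(try_in(equal_copy, the_list))  # fails even though the contents are equal
print(try_in(other, the_list))       # fails like pair[1] in the question
```

Note that even an equal copy raises: once the identity check fails, the list has to interpret a two-element boolean array as a single truth value, regardless of what that array contains.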


    If you want to check whether a specific object obj is one of the elements of a list l (not just ==-equivalent to one of the elements, but actually that object), use any(obj is element for element in l).

    If you want to check whether a NumPy array is "equal" to an array in a list of arrays in the sense of having the same shape and equal elements, use any(numpy.array_equal(obj, element) for element in l).
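Both suggestions can be wrapped in small helpers. The function names contains_identical and contains_equal are my own, not part of NumPy, and the data is reconstructed from the console log in the question on the assumption that pair[0] is the same object as the first list element.

```python
import numpy as np

def contains_identical(obj, arrays):
    """True if obj is the very same object as one of the elements."""
    return any(obj is element for element in arrays)

def contains_equal(obj, arrays):
    """True if some element has the same shape and equal values as obj."""
    return any(np.array_equal(obj, element) for element in arrays)

the_array = [
    np.array([[[213, 742]]], dtype=np.int32),
    np.array([[[127, 740]], [[127, 741]], [[128, 742]], [[127, 741]]],
             dtype=np.int32),
]
pair = (the_array[0], np.array([[[124, 736]]], dtype=np.int32))

print(contains_identical(pair[0], the_array))     # same object as element 0
print(contains_equal(pair[0].copy(), the_array))  # equal values, new object
print(contains_equal(pair[1], the_array))         # no matching element
```

numpy.array_equal returns a plain False for arrays of different shapes instead of broadcasting, so neither helper can hit the ambiguous truth value error.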