Can someone please explain why the following code results in a ValueError?
import heapq
import numpy as np
a = np.ones((2, 2), dtype=int)
states = []
heapq.heappush(states, (0, a))
heapq.heappush(states, (0, a.copy()))
The error message is:
Traceback (most recent call last):
File "x.py", line 8, in <module>
heapq.heappush(states, (0, a.copy()))
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
Running it without adding the a.copy()
to the heap works fine, the second/subsequent one is for some reason a problem. I do understand that there is a unknown truth value aspect with an array of [True, False, True]
and that it's not possible to determine a single True
or False
from that, but why does heapq
need to do that? Especially only in the second case?
TL;DR: Because numpy arrays can't be cast to a boolean if they contain more than one element.
Some informations about heaps:
Heaps "order" their contents (so the items must implement <
but that's an implementation detail).
You however insert items into the heap
by creating tuple
s for the items, where the first elements is some value and the second one is the array.
Comparing tuples first checks if the first items are equal and if they are it checks if the second items are equal and so on until they are not equal, then it will check if it's smaller (when the operation was <
) or greater (for >
). However tuples are implemented in C and the ==
check there is a bit different than the one in Python. It uses PyObject_RichCompareBool
. Especially the "note" is important here
If
o1
ando2
are the same object,PyObject_RichCompareBool()
will always return 1 forPy_EQ
and 0 forPy_NE
.
Now let's go to numpy arrays:
You can't convert a numpy.array
to a bool
if it contains more than one item:
>>> arr = np.array([1,2,3])
>>> bool(arr)
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
An if
check implicitly converts the condition to a boolean:
>>> if arr: pass
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
Even after comparing numpy-arrays they are still numpy arrays:
>>> arr > arr
array([False, False], dtype=bool)
>>> arr == arr
array([ True, True], dtype=bool)
So these can't be evaluated with ==
:
>>> if arr == arr: pass
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
So you can't convert numpy-arrays with more than one element to a boolean! However now comes the interesting part: The heapq
-module uses PyObject_RichCompareBool()
so it can check if two arrays are equal but if and only if these are identical!
That's why it works with the same array passed in several times but it fails when you copy it:
>>> arr is arr
True
>>> arr is arr.copy()
False