The issues
So I have an array I imported containing values ranging from ~0.0 to ~0.76. When I started trying to find the min & max values using Numpy, I ran into some strange inconsistencies that I'd like to know how to solve if they're my fault, or avoid if they're programming errors on the Numpy developers' end.
The code
Let's start with finding the location of the maximum value using np.max and np.where.
print array.shape
print np.max(array)
print np.where(array == 0.763728955743)
print np.where(array == np.max(array))
print array[35,57]
The output is this:
(74, 145)
0.763728955743
(array([], dtype=int64), array([], dtype=int64))
(array([35]), array([57]))
0.763728955743
When I look for where the array exactly equals the maximum entry's value, Numpy doesn't find it. However, when I simply search for the location of the maximum value without specifying what that value is, it works. Note that this doesn't happen with np.min.
Now I have a different issue regarding minima.
print array.shape
print np.min(array)
print np.where(array == 0.0)
print np.where(array == np.min(array))
print array[10,25], array[31,131]
Look at the returns.
(74, 145)
0.0
(array([10, 25]), array([ 31, 131]))
(array([10, 25]), array([ 31, 131]))
0.0769331747301 1.54220192172e-09
1.54e-9 is close enough to 0.0 that it seems like it would be the minimum value. But why is a location with the value 0.077 also listed by np.where? That's not even close to 0.0 compared to the other value.
The Questions
Why doesn't np.where seem to work when entering the maximum value of the array, but it does when searching for np.max(array) instead? And why does np.where() mixed with np.min() return two locations, one of which is definitely not the minimum value?
You have two issues: the interpretation of floats and the interpretation of the results of np.where.
First, floats. print shows a rounded decimal representation of the value, while the array stores the exact binary number. The literal 0.763728955743 therefore does not equal the stored maximum exactly, which is why np.where(array == 0.763728955743) returns an empty array, while np.where(array == np.max(array)) does the right thing. Note that the second case just uses the exact binary number internally without any conversions. The search for the minimum succeeds because 0.0 can be represented exactly in both decimal and binary. In general, it is a bad idea to compare floats using == for this and related reasons.
Second, np.where. For the version of np.where that you are using, it devolves into np.nonzero. You are misinterpreting the results here because it returns an array for each dimension of the array, not individual arrays of coordinates. In other words, the first array holds the row indices of all matches and the second holds the column indices, so (array([10, 25]), array([ 31, 131])) means the minima are at [10, 31] and [25, 131]. You read the result of where as one index per dimension for the maximum case. This is correct, but it is not what you are doing in the minimum case.
There are a number of ways of dealing with these issues. The easiest could be to use np.argmax and np.argmin. These return the index of the first maximum or minimum in the flattened array, respectively, which np.unravel_index converts into a coordinate.
>>> x = np.unravel_index(np.argmax(array), array.shape)
>>> print(x)
(35, 57)
>>> print(array[x])
0.763728955743
The only possible problem here is that you may want to get all of the coordinates. In that case, using where or nonzero is fine. The only difference from your code is that you should print
print array[10,31], array[25,131]
instead of the transposed values as you are doing.
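Both points can be reproduced on a small array. The following is only a sketch: the 3x4 array `data` and its values are made up for illustration and stand in for the (74, 145) array from the question. Using np.isclose with its default tolerances is one possible way to search for a typed-in value, not the only one.

```python
import numpy as np

# Hypothetical 3x4 stand-in for the array in the question.
data = np.full((3, 4), 0.25)
data[0, 1] = 0.0          # first minimum
data[2, 3] = 0.0          # second minimum
data[1, 2] = 0.1 + 0.2    # maximum; not exactly equal to the literal 0.3

# Exact comparison against a rounded decimal literal finds nothing ...
exact = np.where(data == 0.3)
# ... while a tolerance-based comparison finds the maximum.
close = np.where(np.isclose(data, 0.3))

# np.where(condition) returns one index array per dimension:
# all row indices first, then all column indices.
rows, cols = np.where(data == data.min())

# Pair the arrays element-wise to get one coordinate per match.
coords = list(zip(rows.tolist(), cols.tolist()))

# np.argwhere returns the same coordinates as the rows of a 2-D array.
pairs = np.argwhere(data == data.min())

print(exact)
print(close)
print(coords)  # [(0, 1), (2, 3)]
print(pairs)
```

Reading `rows` and `cols` column-wise, as `zip` does, gives the coordinates (0, 1) and (2, 3). Reading each returned array as a coordinate on its own would give (0, 2) and (1, 3), which is the mix-up in the minimum case above.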