I have a NumPy array arr. It's a numpy.ndarray with shape (5553110,) and dtype=float32.
When I do:

>>> (arr > np.pi)[3154950]
False
>>> arr[3154950] > np.pi
True
Why does the first comparison give the wrong result, and how can I fix it?
The values:

arr[3154950] = 3.1415927
np.pi = 3.141592653589793

Is the problem with precision?
Yes, the problem is due to the precision of np.float32 vs. np.float64. Cast the array to np.float64 and you will not see the problem:
import numpy as np
arr = np.array([3.1415927], dtype=np.float64)
print((arr > np.pi)[0]) # True
print(arr[0] > np.pi) # True
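To see why the float32 comparison fails: np.pi rounds to the same float32 value as the array element, so within float32 the two compare equal. A minimal sketch with explicit casts (so the result does not depend on NumPy's promotion rules):

```python
import numpy as np

pi32 = np.float32(np.pi)        # np.pi rounded to float32
val32 = np.float32(3.1415927)   # the array element from the question

# Both values round to the same float32 bit pattern...
print(pi32 == val32)            # True

# ...so a comparison done entirely in float32 cannot see the difference.
print(val32 > pi32)             # False

# Promoted to float64, the element is slightly greater than pi.
print(np.float64(val32) > np.pi)  # True
```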
As @WarrenWeckesser comments:

It involves how NumPy decides to cast the arguments of its operations. Apparently, with arr > scalar, the scalar is converted to the same type as the array arr, which in this case is np.float32. On the other hand, with something like arr > arr2, where both arguments are non-scalar arrays, they are promoted to a common data type. That's why (arr > np.array([np.pi]))[3154950] returns True.
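A short check of that array-vs-array case (a sketch; the one-element float64 array stands in for arr2):

```python
import numpy as np

arr32 = np.array([3.1415927], dtype=np.float32)
pi64 = np.array([np.pi])  # dtype float64

# Two non-scalar arrays are promoted to a common dtype before comparing...
print(np.result_type(arr32, pi64))  # float64

# ...so the float32 element is upcast and compared at full precision.
print((arr32 > pi64)[0])            # True
```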