python arrays numpy floating-point floating-accuracy

Array comparison not matching elementwise comparison in numpy

I have a numpy.ndarray called arr with shape (5553110,) and dtype=float32.

When I do:

>>> (arr > np.pi)[3154950]
False
>>> arr[3154950] > np.pi
True

Why is the first comparison getting it wrong? And how can I fix it?

The values:

arr[3154950] = 3.1415927
np.pi = 3.141592653589793

Is the problem with precision?

Solution

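The problem is the difference in precision between np.float32 and np.float64. The float32 value closest to pi is 3.1415927410125732, which is slightly larger than the float64 value np.pi = 3.141592653589793. Whether the comparison returns True or False therefore depends on the precision in which it is carried out.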
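To make the rounding concrete, here is a minimal sketch that rounds pi to float32 and then widens it back to a Python float so both values can be compared at full float64 precision. The name pi32 is only illustrative and does not appear in the question.

```python
import numpy as np

# Nearest float32 to pi (pi32 is a hypothetical name used for illustration).
pi32 = np.float32(np.pi)

# Widening float32 to a Python float (float64) is exact, so this shows
# the value that is actually stored in the float32 array element.
print(float(pi32))          # 3.1415927410125732
print(np.pi)                # 3.141592653589793

# In float64, the stored float32 value really is larger than np.pi.
print(float(pi32) > np.pi)  # True

# The array element from the question holds the same float32 value.
print(np.float32(3.1415927) == pi32)  # True
```

So a True result means the comparison was done in float64, and a False result means np.pi was first rounded to float32, where it becomes equal to the array element.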

Use np.float64 and you will not see a problem:

import numpy as np

arr = np.array([3.1415927], dtype=np.float64)

print((arr > np.pi)[0])  # True

print(arr[0] > np.pi)    # True

As @WarrenWeckesser comments:

It involves how numpy decides to cast the arguments of its operations. Apparently, with arr > scalar, the scalar is converted to the same type as the array arr, which in this case is np.float32. On the other hand, with something like arr > arr2, with both arguments nonscalar arrays, they will use a common data type. That's why (arr > np.array([np.pi]))[3154950] returns True.
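The casting behaviour described in the comment can be reproduced on a one-element stand-in for the large array. This is only a sketch: the variable names are illustrative, and the scalar-versus-scalar case from the question (arr[i] > np.pi) is left out because its result may depend on the NumPy version in use.

```python
import numpy as np

# Small stand-in for the 5553110-element array in the question.
arr = np.array([3.1415927], dtype=np.float32)

# Array vs. Python float scalar: the comparison is done in float32,
# where np.pi rounds to the same value as the array element.
scalar_cmp = (arr > np.pi)[0]

# Array vs. non-scalar float64 array: both sides are promoted to float64.
array_cmp = (arr > np.array([np.pi]))[0]

# Explicitly widening the array also forces a float64 comparison.
widened_cmp = (arr.astype(np.float64) > np.pi)[0]

print(scalar_cmp, array_cmp, widened_cmp)  # False True True
```

If copying the whole array to float64 is too costly, wrapping the scalar in a float64 array, as in the second comparison, gives the same result without first creating a converted copy of arr.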

Related github issue