python numpy indexing nan inequality

Inequality comparison of a numpy array containing nan with a scalar


I am trying to set members of an array that are below a threshold to nan. This is part of a QA/QC process and the incoming data may already have slots that are nan.

For example, my threshold might be -1000, so I would want to set -3000 to nan in the following array:

x = np.array([np.nan,1.,2.,-3000.,np.nan,5.])

The following:

x[x < -1000.] = np.nan

produces the correct behavior but also a RuntimeWarning, and the overhead of disabling the warning

warnings.filterwarnings("ignore")
...
warnings.resetwarnings()

is kind of heavy and potentially a bit unsafe.

Indexing twice with fancy indexing, as follows, has no effect:

nonan = np.where(~np.isnan(x))[0]
x[nonan][x[nonan] < -1000.] = np.nan

I assume this is because a copy is made due to the integer index or the use of indexing twice.

Does anyone have a relatively simple solution? It would be fine to use a masked array in the process, but the final product has to be an ndarray and I can't introduce new dependencies. Thanks.


Solution

  • Any comparison (other than !=) of a NaN to a non-NaN value will always return False:

    >>> x < -1000
    array([False, False, False,  True, False, False], dtype=bool)
    

    So you can simply ignore the fact that there are NaNs already in your array and do:

    >>> x[x < -1000] = np.nan
    >>> x
    array([ nan,   1.,   2.,  nan,  nan,   5.])
    

    EDIT: I don't see any warning when I run the above, but if you really need to keep the NaNs out of the comparison, you can do something like:

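    mask = ~np.isnan(x)             # True where x is not NaN
    mask[mask] &= x[mask] < -1000   # of those, keep only entries below the threshold
    x[mask] = np.nan                # NaNs never take part in the comparison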
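
If the RuntimeWarning does appear on your NumPy version, a narrower alternative to `warnings.filterwarnings("ignore")` is NumPy's `np.errstate` context manager, which changes floating-point error handling only inside the `with` block. The sketch below assumes the warning is of the floating-point "invalid value" kind, which is what `invalid="ignore"` silences. `mask_below` is a hypothetical helper name used for illustration, not part of NumPy, and it returns a copy rather than modifying the input in place.

```python
import numpy as np


def mask_below(values, threshold):
    """Return a float copy of `values` with entries below `threshold` set to NaN."""
    out = np.array(values, dtype=float)  # np.array copies, so the input is untouched
    # Silence only "invalid value" floating-point warnings, and only in this block.
    with np.errstate(invalid="ignore"):
        below = out < threshold  # comparisons against NaN evaluate to False
    out[below] = np.nan
    return out


x = np.array([np.nan, 1., 2., -3000., np.nan, 5.])
cleaned = mask_below(x, -1000.)
print(cleaned)
```

The previous error-handling settings are restored when the `with` block exits, so nothing has to be reset by hand and unrelated warnings elsewhere in the program stay visible.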