python numpy nan masked-array

Masked values in numpy digitize


I want np.digitize to ignore some values in my array. To achieve this, I replaced the unwanted values with NaN and masked them:

import numpy as np
A = np.ma.array(A, mask=np.isnan(A))

Nonetheless, np.digitize returns -1 for the masked values. Is there an alternative way to make np.digitize ignore the masked values (or NaN)?


Solution

  • Unless this is meant as a performance optimization, you can simply apply the mask after calling np.digitize:

    import numpy as np
    
    # np.float was removed in NumPy 1.24; the builtin float is equivalent
    A = np.arange(10, dtype=float)
    A[0] = np.nan
    A[-1] = np.nan
    
    bins = np.array([1,2,7])
    
    res = np.digitize(A,bins)
    
    # np.digitize assigns NaN to the highest bin
    # (observed with NumPy 1.17.2)
    print(res)
    
    # so filter out the NaN positions after calling
    # np.digitize
    print(res[~np.isnan(A)])
    
    # Output:
    # [3 1 2 2 2 2 2 3 3 3]
    # [1 2 2 2 2 2 3 3]
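
Filtering with a boolean index shortens the result, so the bin indices no longer line up with the positions in A. If the original shape matters, one option is to wrap the output of np.digitize in a masked array instead. The following is only a sketch: the helper name digitize_ignoring_nan is hypothetical (it is not part of NumPy), and it assumes the unwanted values are marked with NaN as in the question.

```python
import numpy as np

def digitize_ignoring_nan(values, bins):
    """Hypothetical helper: bin `values` and mask the positions that hold NaN."""
    values = np.asarray(values, dtype=float)
    invalid = np.isnan(values)
    # NaN entries still receive a bin index here, but the mask hides it
    indices = np.digitize(values, bins)
    return np.ma.masked_array(indices, mask=invalid)

A = np.arange(10, dtype=float)
A[0] = np.nan
A[-1] = np.nan
bins = np.array([1, 2, 7])

res = digitize_ignoring_nan(A, bins)
print(res)               # same length as A; masked entries are shown as --
print(res.compressed())  # only the bin indices of the valid entries
```

The masked result keeps the same length as A, and res.compressed() gives the same values as the boolean-index filtering above.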