python arrays numpy filter masking

Filtering NumPy arrays based on the index of a certain value


I have a numpy array like:

array([[ 1, 17, 33, ..., 28,  9, 22],
       [ 3, 11,  1, ..., 25, 45, 14],
       [ 3, 11,  1, ..., 21, 23,  5],
       ...,
       [20,  6, 27, ..., 43, 15, 14],
       [27,  6, 39, ..., 37, 17,  2],
       [ 3, 11,  8, ..., 27, 35, 32]], dtype=int32)

From here, I would like to filter out rows where the value 4 occurs at index 10 or earlier.

e.g. [1, 2, 3, 4, ..., 34, 35] - filter out, as the value 4 occurs at index 3, which is before index 10

e.g. [35, 34, 33, 32, ..., 4, 3, 2, 1] - keep, as the value 4 occurs after index 10.

What would be the way to achieve this filtering using NumPy masking?


Solution

  • You can try this:

    import numpy as np
    
    arr = np.array(
        [[1, 2, 3, 4, 34, 35, 2, 1],
         [1, 2, 3, 5, 6, 7, 8, 9],
         [1, 2, 4, 5, 6, 7, 8, 9],
         [4, 2, 3, 5, 6, 7, 8, 9],
         [35, 34, 33, 32, 4, 3, 2, 1]]
    )
    
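    V, T = 4, 3  # value to look for and last index to check; use T = 10 for the real data
    
    # True for rows where V occurs anywhere in columns 0..T
    m = np.any(arr[:, :T+1] == V, axis=1)
    
    # keep only the rows where the mask is False
    out = arr[~m]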
    

    Output:

    >>> out
    
    array([[ 1,  2,  3,  5,  6,  7,  8,  9],
           [35, 34, 33, 32,  4,  3,  2,  1]])
    

    Intermediates:

    >>> arr[:, :T+1]
    array([[ 1,  2,  3,  4],
           [ 1,  2,  3,  5],
           [ 1,  2,  4,  5],
           [ 4,  2,  3,  5],
           [35, 34, 33, 32]])
    
    >>> arr[:, :T+1] == V
    array([[False, False, False,  True],
           [False, False, False, False],
           [False, False,  True, False],
           [ True, False, False, False],
           [False, False, False, False]])
    
    >>> np.any(arr[:, :T+1] == V, axis=1)
    array([ True, False,  True,  True, False])