Tags: python, python-3.x, filtering, numpy-ndarray

How to filter the values of one np.ndarray by the elements of another


I ran into a problem while using masks in my algorithm's logic.

There are two arrays. The first, call it "x", is two-dimensional.

The second, call it "y", is one-dimensional and holds one index per subarray (row) of the first. That is, y[i] is the index of the element to take from x[i].

I implemented this with a plain Python list comprehension:

import numpy as np
x = np.array([[.55, .45], [0.78, .22], [.85, .15]])
y = np.array([1,0,1])
preds = np.array([x[i, y[i]] for i in range(y.shape[0])])
print(preds)  # [0.45 0.78 0.15] <- 0.45 == x[0][1], 0.78 == x[1][0], 0.15 == x[2][1]

However, this implementation looks clunky. I searched NumPy's documentation and couldn't find anything that matches my question.

Of course, I could build a 2D mask for "x" from "y", but speed is critical for me, so that approach is not an option.

Can you tell me how to proceed in this case?


Solution

  • Here are a few variations on the same idea, integer-array indexing with one row index paired with each entry of y:

    import numpy as np
    x = np.array([[.55, .45], [0.78, .22], [.85, .15]])
    y = np.array([1,0,1])
    preds = x[np.arange(x.shape[0]), y] 
    

    or

    ... # as before
    preds = x[np.arange(len(y)), y]
    

    since y has to have length x.shape[0].

    Or use np.r_[<slice>]:

    preds = x[np.r_[:len(y)], y]
    

    which may or may not be more readable, since np.r_ is a lesser-known part of NumPy.
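
As a quick sanity check, the sketch below compares the indexing approach against the list comprehension from the question. It also tries np.take_along_axis, which is not part of the original answer; it is included here only as one more possible alternative, assuming NumPy 1.15 or newer. The names expected, rows, preds_index and preds_take are illustrative.

```python
import numpy as np

x = np.array([[.55, .45], [0.78, .22], [.85, .15]])
y = np.array([1, 0, 1])

# Reference result from the question's list comprehension.
expected = np.array([x[i, y[i]] for i in range(y.shape[0])])

# Integer-array indexing: rows[i] is paired with y[i],
# so element i of the result is x[rows[i], y[i]].
rows = np.arange(x.shape[0])
preds_index = x[rows, y]

# np.take_along_axis expects indices with the same number of dimensions
# as x, so y is reshaped to a column and the result is flattened to 1-D.
preds_take = np.take_along_axis(x, y[:, None], axis=1).ravel()

assert np.array_equal(preds_index, expected)
assert np.array_equal(preds_take, expected)
print(preds_index)
```

Both vectorized forms avoid the Python-level loop, which is usually what matters when the arrays are large; which one reads better is a matter of taste.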