Tags: python, arrays, performance, numpy, masking

Speeding up Numpy Masking


I'm still an amateur when it comes to thinking about how to optimize. I have this section of code that takes in a list of found peaks and finds where these peaks, +/- some value, are located in a multidimensional array. It then increments the corresponding indices of a zeros array. The code works, but it takes a long time to execute: close to 45 minutes when ind has 270 values and refVals has a shape of (3050,3130,80). I understand that it's a lot of data to churn through, but is there a more efficient way of going about this?

maskData = np.zeros_like(refVals).astype(np.int16)

for peak in ind:
    tmpArr = np.ma.masked_outside(refVals,x[peak]-2,x[peak]+2).astype(np.int16)
    maskData[tmpArr.mask == False] += 1
    tmpArr = None

maskData = np.sum(maskData,axis=2)
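For reference, the snippet above isn't runnable on its own, since x, ind and refVals come from earlier peak-finding code. A toy-sized stand-in setup (all names, sizes and peak indices here are assumptions, not the question's real data) looks like:

```python
import numpy as np

# Toy stand-ins (assumptions): the real refVals is (3050, 3130, 80)
# and ind holds ~270 peak indices into x.
rng = np.random.default_rng(0)
refVals = rng.uniform(0, 100, size=(8, 9, 5))
x = np.linspace(0, 100, 60)          # axis the peak indices point into
ind = np.array([5, 20, 45])          # hypothetical peak indices

maskData = np.zeros_like(refVals).astype(np.int16)
for peak in ind:
    # Mask everything outside [x[peak]-2, x[peak]+2], then count the in-range hits
    tmpArr = np.ma.masked_outside(refVals, x[peak] - 2, x[peak] + 2)
    maskData[tmpArr.mask == False] += 1

maskData = np.sum(maskData, axis=2)  # per-(row, col) count over the last axis
```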

Solution

  • Approach #1 : Memory permitting, here's a vectorized approach using broadcasting -

    # Create +/-2 limits using ind
    r = x[ind[:,None]] + [-2,2]
    
    # Use limits to get inside matches and sum over the iterative and last dim
    mask = (refVals >= r[:,None,None,None,0]) & (refVals <= r[:,None,None,None,1])
    out = mask.sum(axis=(0,3))
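As a sanity check on toy-sized data (the shapes and peak indices below are made up), the broadcasting result matches the original masked-array loop:

```python
import numpy as np

# Toy-sized demo of the broadcasting approach (assumed shapes; the
# real data is (3050, 3130, 80) with ~270 peaks).
rng = np.random.default_rng(0)
refVals = rng.uniform(0, 100, size=(6, 7, 4))
x = np.linspace(0, 100, 50)
ind = np.array([5, 20, 40])

# Create +/-2 limits using ind -> shape (len(ind), 2)
r = x[ind[:, None]] + [-2, 2]

# Broadcast-compare: mask has shape (len(ind), 6, 7, 4)
mask = (refVals >= r[:, None, None, None, 0]) & (refVals <= r[:, None, None, None, 1])
out = mask.sum(axis=(0, 3))

# Reference result from the original masked-array loop
maskData = np.zeros_like(refVals).astype(np.int16)
for peak in ind:
    tmpArr = np.ma.masked_outside(refVals, x[peak] - 2, x[peak] + 2)
    maskData[tmpArr.mask == False] += 1
expected = maskData.sum(axis=2)

assert np.array_equal(out, expected)
```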
    

    Approach #2 : If running out of memory with the previous one, we could use a loop with NumPy boolean arrays, which should be more efficient than masked arrays. We would also perform one more level of sum-reduction inside the loop, so that we drag less data with us across iterations. Thus, the alternative implementation would look something like this -

    out = np.zeros(refVals.shape[:2]).astype(np.int16)
    x_ind = x[ind]
    for i in x_ind:
        out += ((refVals >= i-2) & (refVals <= i+2)).sum(-1)
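On the same kind of toy data (again, made-up shapes and peaks), this loop reproduces the broadcasting result while only holding one refVals-sized boolean array at a time:

```python
import numpy as np

# Toy-sized check of the loop-based variant (shapes and peak indices
# are assumptions; the real data is (3050, 3130, 80) with ~270 peaks).
rng = np.random.default_rng(1)
refVals = rng.uniform(0, 100, size=(5, 6, 3))
x = np.linspace(0, 100, 40)
ind = np.array([3, 10, 25])

out = np.zeros(refVals.shape[:2]).astype(np.int16)
for i in x[ind]:
    # Reduce over the last axis before accumulating into `out`
    out += ((refVals >= i - 2) & (refVals <= i + 2)).sum(-1)

# Cross-check against the fully vectorized broadcasting approach
r = x[ind[:, None]] + [-2, 2]
mask = (refVals >= r[:, None, None, None, 0]) & (refVals <= r[:, None, None, None, 1])
assert np.array_equal(out, mask.sum(axis=(0, 3)))
```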
    

    Approach #3 : Alternatively, we could replace that limit-based comparison with np.isclose in approach #2. Thus, the only step inside the loop would become -

    out += np.isclose(refVals,i,atol=2).sum(-1)
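One caveat: np.isclose(a, b, atol=2) tests |a - b| <= atol + rtol*|b| with a default rtol of 1e-5, so the accepted window is marginally wider than the explicit +/-2 limits. Passing rtol=0 makes the two comparisons exactly equivalent, as this small check (toy data, assumed values) illustrates:

```python
import numpy as np

# np.isclose uses |a - b| <= atol + rtol * |b|; with the default
# rtol=1e-5 the window is slightly wider than +/-2, so set rtol=0
# to match the limit-based comparison exactly.
rng = np.random.default_rng(2)
refVals = rng.uniform(0, 100, size=(4, 5, 3))
i = 50.0  # hypothetical peak value

exact = (refVals >= i - 2) & (refVals <= i + 2)
assert np.array_equal(np.isclose(refVals, i, atol=2, rtol=0), exact)
```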