python arrays numpy cumsum

Cumulative sum in a NumPy array with a stop condition


I want to optimize my NumPy code. I'm working with large arrays, so efficiency matters, and I try to avoid for-loops where possible. Let's assume a simple 2-D array:

1 3 5
2 0 1
5 6 2

My task is to take values down each column until their cumulative sum reaches a certain value, truncating the last value taken if needed so that the total does not exceed it. Let's call this value clip. After this operation (here with clip = 3) the array looks like this:

1 3 3
2 0 0
0 0 0 

I came up with a rather naive approach that computes it with simple transformations:

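# assumes `import numpy as np`, a 2-D `array`, and a scalar `clip`
array_clipped = np.clip(array, 0, clip)
array_clipped_cumsum = np.cumsum(array_clipped, axis=0)
difference = clip - array_clipped_cumsum  # negative where the running total overshoots clip
difference_trimmed = np.where(difference < 0, difference, 0)  # keep only the overshoot
final = array_clipped + difference_trimmed
final_clean = np.where(final >= 0, final, 0)  # zero out everything past the cap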

Although this code works, it looks clumsy and not like idiomatic NumPy.


Solution

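  • Here is another one-liner. It caps the column-wise cumulative sum at the limit and then takes differences of consecutive rows to recover the individual values: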

    A = np.random.randint(0,10,(6,4))
    A
    # array([[0, 8, 7, 6],
    #        [3, 2, 0, 4],
    #        [5, 6, 6, 4],
    #        [4, 5, 0, 3],
    #        [7, 9, 6, 8],
    #        [0, 9, 8, 3]])
    cap = 15
    
    np.diff(np.minimum(A.cumsum(0),cap),axis=0,prepend=0)
    # array([[0, 8, 7, 6],
    #        [3, 2, 0, 4],
    #        [5, 5, 6, 4],
    #        [4, 0, 0, 1],
    #        [3, 0, 2, 0],
    #        [0, 0, 0, 0]])
    

    Or in two lines, avoiding the slow prepend by subtracting in place:

    out = np.minimum(A.cumsum(0),cap)
    out[1:] -= out[:-1]
    out
    # array([[0, 8, 7, 6],
    #        [3, 2, 0, 4],
    #        [5, 5, 6, 4],
    #        [4, 0, 0, 1],
    #        [3, 0, 2, 0],
    #        [0, 0, 0, 0]])
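
To tie this back to the question, here is a minimal sketch that wraps the two-line version in a function and runs it on the 3x3 array from the question with a cap of 3. The names `clip_cumsum` and `clip_cumsum_loop` are made up for illustration and are not part of NumPy. The sketch assumes all entries are non-negative, so the cumulative sum never decreases. The plain-loop version is there only as a cross-check.

```python
import numpy as np


def clip_cumsum(values, cap):
    """Keep values down each column until their running total reaches cap.

    Hypothetical helper (name and signature are illustrative only).
    Assumes non-negative entries, so the cumulative sum never decreases.
    """
    # Running total per column, capped at `cap`.
    capped = np.minimum(np.cumsum(values, axis=0), cap)
    out = capped.copy()
    # Differences of consecutive capped totals give the per-row values.
    out[1:] -= capped[:-1]
    return out


def clip_cumsum_loop(values, cap):
    """Plain-loop reference, used only to cross-check the vectorized version."""
    values = np.asarray(values)
    out = np.zeros_like(values)
    for j in range(values.shape[1]):
        remaining = cap  # how much of the cap is still unused in this column
        for i in range(values.shape[0]):
            take = min(values[i, j], remaining)
            out[i, j] = take
            remaining -= take
    return out


array = np.array([[1, 3, 5],
                  [2, 0, 1],
                  [5, 6, 2]])

result = clip_cumsum(array, 3)
print(result)
# [[1 3 3]
#  [2 0 0]
#  [0 0 0]]
```

This matches the expected output in the question. If the array may contain negative numbers, the capped cumulative sum is no longer monotonic and the result would need a separate definition.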