I have a pandas DataFrame with two columns, like this:

      Item  Value
    0    A      7
    1    A      2
    2    A     -6
    3    A    -70
    4    A      8
    5    A      0

I want to compute the cumulative sum over the Value column, but whenever the running sum becomes negative I want to reset it to 0.
I am currently using the loop below to do this:

    sum_ = 0
    cumsum = []
    for val in sample['Value'].values:
        sum_ += val
        if sum_ < 0:
            sum_ = 0
        cumsum.append(sum_)
    print(cumsum)  # [7, 9, 3, 0, 8, 8]

I am looking for a more efficient way to perform this in pure pandas.
This can be done using numpy, but it is slower than the numba solution.
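
    import numpy as np

    # Binary ufunc: add the next value, then clamp a negative running total to 0
    sumlm = np.frompyfunc(lambda a, b: 0 if a + b < 0 else a + b, 2, 1)
    # dtype=object is required because frompyfunc ufuncs operate on Python objects
    newx = sumlm.accumulate(df.Value.values, dtype=object)
    newx
    Out[147]: array([7, 9, 3, 0, 8, 8], dtype=object)
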
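As a side note, the same accumulate idea can be sketched with only the standard library. This is a minimal illustration, not a benchmarked alternative: it assumes Python 3.8+ (for the initial argument of itertools.accumulate), and the name clipped_cumsum is made up here. Seeding the fold with 0 means a negative first value is clamped as well, whereas an accumulate without a seed returns the first element as-is.

```python
from itertools import accumulate


def clipped_cumsum(values):
    """Running sum that is reset to 0 whenever it would become negative."""
    # initial=0 seeds the fold, so the very first value is clamped too
    running = accumulate(values, lambda acc, v: max(acc + v, 0), initial=0)
    next(running)  # discard the seed, which accumulate emits first
    return list(running)


print(clipped_cumsum([7, 2, -6, -70, 8, 0]))  # [7, 9, 3, 0, 8, 8]
print(clipped_cumsum([-5, 3, -1, 4]))         # [0, 3, 2, 6]
```

With a DataFrame, the input could be something like sample['Value'].tolist(). Since this runs in pure Python, there is no reason to expect it to beat the numba version on large inputs.
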
Here is the numba solution:

    from numba import njit

    @njit
    def cumli(x, lim):
        total = 0
        result = []
        for y in x:
            total += y
            # Reset the running total once it drops below the limit
            if total < lim:
                total = 0
            result.append(total)
        return result

    cumli(df.Value.values, 0)
    Out[166]: [7, 9, 3, 0, 8, 8]