python pandas dataframe cumsum

AttributeError: 'str' object has no attribute 'cumsum'


The cumulative total of df["Size"] should go into a new column, without ever letting the running sum drop below 0:

         Size 
0        11.0
1        18.0
2       -13.0
3        -4.0
4       -26.0
5        30.0

print(df["Cumulative"]) output should read:

       Cumulative
0        11
1        29
2        16
3        12
4         0 
5        30

I hoped a lambda might help, but I get an error:

df.Size = df.Size.astype(int)
df["Cumulative"] = df.Size.apply(lambda x: x.cumsum() if x.cumsum() > 0 else 0)
print(df)

Output:

AttributeError: 'int' object has no attribute 'cumsum'

This error appears whichever data type I cast the column to, e.g. 'str' or 'float'.

Alternatively, I started with:

df.Size = df.Size.astype(int)
df["Cumulative"] = df.Size.cumsum()

Output:

       Cumulative
0         11
1         29
2         16
3         12
4        -14
5         16

This works as expected, but it does not stop the total from dropping below 0.


Solution
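
On the AttributeError itself: cumsum is a method of the whole Series, while Series.apply invokes the function once per element, so inside the lambda x is a single number. The snippet below is only an illustrative sketch (the `received` list is a hypothetical helper used to record what the function is handed):

```python
import pandas as pd

sizes = pd.Series([11, 18, -13, -4, -26, 30], name="Size")

# cumsum works on the Series as a whole.
running = sizes.cumsum()

# Series.apply calls the function once per element, so the lambda
# receives individual scalars, never the Series itself.
received = []
sizes.apply(lambda x: received.append(x) or x)

print(running.tolist())
print([type(x).__name__ for x in received])
```

Because each x is a scalar, calling x.cumsum() inside the lambda has nothing to accumulate over, which is why casting the column to another dtype does not help.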

  • Update

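    Use accumulate from itertools (the initial argument needs Python 3.8+), so that the running total is reset to 0 whenever it would go negative: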

    from itertools import accumulate
    
    def reset_cumsum(bal, val):
        return max(bal + val, 0)  # Enhanced by @Chrysophylaxs
        # return bal if (bal := bal + val) > 0 else 0
    
    df['Cumulative'] = list(accumulate(df['Size'], func=reset_cumsum, initial=0))[1:]
    print(df)
    
    # Output
       Size  Cumulative
    0  11.0        11.0
    1  18.0        29.0
    2 -13.0        16.0
    3  -4.0        12.0
    4 -26.0         0.0
    5  30.0        30.0
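
To make the recurrence behind the accumulate call explicit, here is the same logic as a plain loop over a list. This is a minimal sketch: the function name `floor_cumsum` and its `floor` parameter are made up for illustration and are not part of pandas or itertools.

```python
def floor_cumsum(values, floor=0):
    """Running total that is reset to `floor` whenever it would fall below it."""
    total = floor
    out = []
    for v in values:
        # Same step as reset_cumsum: add the value, then clamp at the floor.
        total = max(total + v, floor)
        out.append(total)
    return out


sizes = [11.0, 18.0, -13.0, -4.0, -26.0, 30.0]
result = floor_cumsum(sizes)
print(result)
```

With a DataFrame, this could be called as floor_cumsum(df['Size']) and assigned to a new column, assuming the column is numeric.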
    

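    You can use expanding and compute the sum at each step: if the sum is greater than 0, return it, otherwise return 0. Note that this only clips negative totals to 0 and does not reset the running sum, so row 5 is 16 rather than 30: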

    >>> df['Size'].expanding().apply(lambda x: c if (c := x.sum()) > 0 else 0)
    0    11.0
    1    29.0
    2    16.0
    3    12.0
    4     0.0
    5    16.0
    Name: Size, dtype: float64
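
If a loop-free version is preferred, the reset behaviour can also be built from cumsum and cummin. This is not taken from the answer above. It is a sketch that relies on the identity "reset total = running sum minus the lowest running sum seen so far (capped at 0)", shown here on the sample data only:

```python
import pandas as pd

df = pd.DataFrame({"Size": [11.0, 18.0, -13.0, -4.0, -26.0, 30.0]})

# Ordinary running sum: 11, 29, 16, 12, -14, 16
running = df["Size"].cumsum()

# Lowest running sum so far, but never above 0: 0, 0, 0, 0, -14, -14
lowest = running.cummin().clip(upper=0)

# Subtracting the (non-positive) minimum lifts the total back to 0
# at every point where it would have gone negative.
df["Cumulative"] = running - lowest
print(df)
```

For long float columns the subtraction may introduce small rounding differences compared with the step-by-step version, so it is worth checking against accumulate on real data.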