Search code examples
pythonpandassubtractioncumulative-sum

Subtract two columns and show outcome in the next line


I would like to get a cumulative sum of X and also subtract X-Y in each column. The remainder should be found and added to the next row. I would also like the count to reset to 0 at the end of every month. For example, the 1st of every month should be a 0 in X.

For example:

DF1:

Date X Y
2021-04-25 4 4
2021-04-26 0 0
2021-04-27 0 0
2021-04-28 56 53
2021-04-29 0 0
2021-04-30 1 0
2021-05-01 0 0
2021-05-02 5 0
2021-05-03 5 5
2021-05-04 0 0

Dfdesired:

Date X Y
2021-04-25 4 4
2021-04-26 0 0
2021-04-27 0 0
2021-04-28 56 53
2021-04-29 3 0
2021-04-30 4 0
2021-05-01 0 0
2021-05-02 5 0
2021-05-03 10 5
2021-05-04 5 0

I have tried this for the cumulative sum but it does not seem to be working and I am unsure how to reset to 0 at the end of the month.

df1['X'] += df1['X'] - df1['Invoice Rejected']

Solution

  • Try this to apply the calculation on each month:

    df["X"] = (df["X"] + 
               df.eval("Diff = X - Y").groupby(pd.Grouper(key="Date", freq="M"))["Diff"]
                 .cumsum().shift().fillna(0))
    

    Output:

            Date     X   Y
    0 2021-04-25   4.0   4
    1 2021-04-26   0.0   0
    2 2021-04-27   0.0   0
    3 2021-04-28  56.0  53
    4 2021-04-29   3.0   0
    5 2021-04-30   4.0   0
    6 2021-05-01   4.0   0
    7 2021-05-02   5.0   0
    8 2021-05-03  10.0   5
    9 2021-05-04   5.0   0
    

    Details:

    1. Create a "Diff" column temporary
    2. Use pd.Grouper to group on "Date" with a freq="M" and cumsum
    3. Shift and fillna
    4. Add back to "X"