I would like to get a cumulative sum of X and also subtract X-Y in each column. The remainder should be found and added to the next row. I would also like the count to reset to 0 at the end of every month. For example, the 1st of every month should be a 0 in X.
For example:
DF1:
Date | X | Y |
---|---|---|
2021-04-25 | 4 | 4 |
2021-04-26 | 0 | 0 |
2021-04-27 | 0 | 0 |
2021-04-28 | 56 | 53 |
2021-04-29 | 0 | 0 |
2021-04-30 | 1 | 0 |
2021-05-01 | 0 | 0 |
2021-05-02 | 5 | 0 |
2021-05-03 | 5 | 5 |
2021-05-04 | 0 | 0 |
Dfdesired:
Date | X | Y |
---|---|---|
2021-04-25 | 4 | 4 |
2021-04-26 | 0 | 0 |
2021-04-27 | 0 | 0 |
2021-04-28 | 56 | 53 |
2021-04-29 | 3 | 0 |
2021-04-30 | 4 | 0 |
2021-05-01 | 0 | 0 |
2021-05-02 | 5 | 0 |
2021-05-03 | 10 | 5 |
2021-05-04 | 5 | 0 |
I have tried this for the cumulative sum but it does not seem to be working and I am unsure how to reset to 0 at the end of the month.
df1['X'] += df1['X'] - df1['Invoice Rejected']
Try this to apply the calculation on each month:
df["X"] = (df["X"] +
df.eval("Diff = X - Y").groupby(pd.Grouper(key="Date", freq="M"))["Diff"]
.cumsum().shift().fillna(0))
Output:
Date X Y
0 2021-04-25 4.0 4
1 2021-04-26 0.0 0
2 2021-04-27 0.0 0
3 2021-04-28 56.0 53
4 2021-04-29 3.0 0
5 2021-04-30 4.0 0
6 2021-05-01 4.0 0
7 2021-05-02 5.0 0
8 2021-05-03 10.0 5
9 2021-05-04 5.0 0
Details: