I've checked several websites and StackOverflow, but I wasn't able to find any documentation or help around this topic, so hoping that someone in the community might know the answer. I have a list of groceries, quantity, transaction (B for buy and S for sell) and date purchased/sold:
TRANSACTIONS Qty Transaction Date
Apple 5 B 1/1/22
Banana 5 B 1/1/22
Cereal 3 B 1/1/22
Milk 4 B 1/1/22
Banana 2 S 2/2/22
Milk 1 S 2/2/22
Toy 4 B 2/2/22
What I would like to do is generate a new dataframe that incorporates the transactions to show the quantity remaining based on the transaction and each date there was a transaction:
BALANCE Qty Date
Apple 5 1/1/22
Banana 5 1/1/22
Cereal 3 1/1/22
Milk 4 1/1/22
Apple 5 2/2/22
Banana 3 2/2/22
Cereal 3 2/2/22
Milk 3 2/2/22
Toy 4 2/2/22
Here's a one-liner solution (formatted):
new_df = (
df.pivot(index='TRANSACTIONS', columns='Date')
.pipe(lambda x: x.assign(Qty=x.Qty.fillna(0), Transaction=x.Transaction.ffill(axis=1)))
.stack()
.sort_index(level=1)
.reset_index()
.dropna()
.pipe(lambda x: x.assign(Qty=x['Qty'].where(x['Transaction'] == 'B', -x['Qty'])))
.pipe(lambda x: x.assign(Qty=x.groupby('TRANSACTIONS')['Qty'].cumsum().astype(int)))
.drop(['Transaction'], axis=1)
.rename({'TRANSACTIONS': 'BALANCE'}, axis=1)
)
Output:
>>> new_df
BALANCE Date Qty
0 Apple 1/1/22 5
1 Banana 1/1/22 5
2 Cereal 1/1/22 3
3 Milk 1/1/22 4
5 Apple 2/2/22 5
6 Banana 2/2/22 3
7 Cereal 2/2/22 3
8 Milk 2/2/22 3
9 Toy 2/2/22 4