I have two Dataframes. Dataframe1(df1): Column B is initially empty and needs to be populated as below.
| A( in days) | B (sum) |
| ----------- | ------- |
| 7 | |
| 14 | |
| 18 | |
| 25 | |
df1 has two columns: Column A in days and another column B that is blank and needs to be filled with sum of rows from dataframe 2,the number of rows to be summed is based on how days translates to week in column B.
Dateframe 2 (df2):
| week | C(weekly) |
| ---- | --------- |
| WK1 | 10 |
| WK2 | 5 |
| WK3 | 7 |
| WK4 | 4 |
| WK5 | 8 |
Dataframe 2 has two columns: This dataframe has Week numbers and weekly which is sum of weekly units.
I want to fill the values in column A of dataframe 1 based on column d in Dataframe 2 like below:
For first row when A=7 days then just use WK1 values (As 7 days is 1 week so just WK1 of C from DF2) so B=10
For second row when A=14 days (2 weeks), I need WK1 and WK2 sum for C from DF2) so B=10+5=15 For third row, when A=18,I need the sum of WK1 and WK2 and 5/7(WK3) of C from DF2) so B=10+5+(4/7)*7=19 For fourth row, when A=25, I need the sum of WK1+Wk2+wk3+(4/7)*4,so B=24.28 (3 weeks plus fraction -21day + 4/7th of week4 value)
df1 (Completed):
| A( in days) | B(sum) | Methodology |
| ----------- | ------- | --------------- |
| 7 | 10 | 10 |
| 14 | 15 | 10+5 |
| 18 | 19 | 10+5+(4/7)*7 |
| 25 | 24.28 | 10+5+7+(4/7)*4 |
I am new to python and do now know how to proceed. Please help
IIUC, you can use a mod
operation to correct the cumsum
:
r = df1['A'].mod(7).to_numpy()
c = df2['C'].head(len(df1))
df1['B'] = c.cumsum().sub((7-r)/7*c.where(r>0), fill_value=0).to_numpy()
Output:
A B
0 7 10.000000
1 14 15.000000
2 18 19.000000
3 25 24.285714
Then we need to use a custom function
def cust_sum(n, ref):
d, r = divmod(n, 7)
return ref.head(d).sum() + (ref.iat[d]*r/7 if r else 0)
df1['B'] = df1['A'].apply(cust_sum, ref=df2['C'])
Example:
A B
0 7 10.000000
1 14 15.000000
2 18 19.000000
3 29 27.142857