I am trying to add a column with the weighted average of 4 columns with 4 columns of weights
df = pd.DataFrame.from_dict(dict([('A', [2000, 1000, 2509, 2145]),
('A_Weight', [37, 47, 33, 16]),
('B', [2100, 1500, 2000, 1600]),
('B_weights', [17, 21, 6, 2]),
('C', [2500, 1400, 0, 2300]),
('C_weights', [5, 35, 0, 40]),
('D', [0, 1600, 2100, 2000]),
('D_weights', [0, 32, 10, 5])]))
I want the weighted average to be in a new column named "WA" but every time I try it displays NaN
Desired Dataframe would be a new column with the following values as ex:
Formula I used (((A * A_weight)+(B * b_weight)+(C * C_weight)+(D * D_weight)) / sum(all weights)
df['WA'] = [2071.19,1323.70, 2363.20,2214.60 ]
Thank you
A straight-forward and simple way to do is as follows:
(Since your columns name for the weights are not consistently named, e.g. some with 's' and some without, some with capital 'W' and some with lower case 'w', it is not convenient to group columns e.g. by .filter()
)
df['WA'] = ( (df['A'] * df['A_Weight']) + (df['B'] * df['B_weights']) + (df['C'] * df['C_weights']) + (df['D'] * df['D_weights']) ) / (df['A_Weight'] + df['B_weights'] + df['C_weights'] + df['D_weights'])
Result:
print(df)
A A_Weight B B_weights C C_weights D D_weights WA
0 2000 37 2100 17 2500 5 0 0 2071.186441
1 1000 47 1500 21 1400 35 1600 32 1323.703704
2 2509 33 2000 6 0 0 2100 10 2363.204082
3 2145 16 1600 2 2300 40 2000 5 2214.603175