I am trying to compute a weighted average from a DataFrame, but neither apply nor agg seems to work; my code returns the error 'numpy.float64' object is not callable
I have the following df
import numpy as np
import pandas as pd

df = pd.DataFrame([['RETIRO', 65, 1, 10.7],
                   ['SAN NICOLAS', 116, 1, 23.2],
                   ['RETIRO', 101, 2, 28.7],
                   ['FLORES', 136, 2, 23.5]],
                  columns=['BARRIO', 'HOGARES', 'COMUNA', 'NSE'])
I define the function
def avg_w(dt):
    return np.average(a=dt.NSE, weights=dt.HOGARES)
and now apply it to my df,
df.loc[:,['COMUNA','NSE','HOGARES']].groupby(['COMUNA']).apply(avg_w(df))
and it returns 'numpy.float64' object is not callable
I also tried something similar to the suggestions found here and here.
I changed the function,
def avg_w2(dt):
    return pd.Series({'avg_w2': np.average(a=dt.NSE, weights=dt.HOGARES)})
and the apply
df.loc[:,['COMUNA','NSE','HOGARES']].groupby(['COMUNA']).apply({'avgw': [avg_w2(dt)]})
But it didn't work either. The code returns TypeError: unhashable type: 'dict'
The function works on its own, but something goes wrong when I pass it to apply (or aggregate; I tried both).
I expect to obtain, for each COMUNA, the average NSE weighted by HOGARES.
Seems like what you want is the following:
df = df.iloc[:, 1:].groupby(by="COMUNA").apply(
    lambda grp: np.average(a=grp["NSE"], weights=grp["HOGARES"])
)
This results in the following Series, indexed by COMUNA:
COMUNA
1 18.711050
2 25.716034
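Note: you may use a named function instead of the lambda expression, but you need to pass the function itself, i.e. df.groupby("COMUNA").apply(avg_w), NOT df.groupby("COMUNA").apply(avg_w(df)). Writing avg_w(df) calls the function immediately and hands its return value, a numpy.float64, to apply, which then tries to call that number once per group. That is what raises 'numpy.float64' object is not callable.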
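To make the note concrete, here is a minimal sketch using the question's data and its avg_w function. The names overall and result are only illustrative, and selecting the NSE and HOGARES columns after the groupby is my own choice (it keeps the grouping column out of what the function receives), not something the question requires.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [["RETIRO", 65, 1, 10.7],
     ["SAN NICOLAS", 116, 1, 23.2],
     ["RETIRO", 101, 2, 28.7],
     ["FLORES", 136, 2, 23.5]],
    columns=["BARRIO", "HOGARES", "COMUNA", "NSE"],
)


def avg_w(dt):
    # dt is the sub-frame holding the rows of a single COMUNA
    return np.average(a=dt["NSE"], weights=dt["HOGARES"])


# Calling avg_w(df) runs the function right away on the whole frame
# and yields a plain number, which apply() cannot call later.
overall = avg_w(df)
print(callable(overall))

# Passing the function object lets pandas call it once per group.
result = df.groupby("COMUNA")[["NSE", "HOGARES"]].apply(avg_w)
print(result)
```

The only difference from the failing call in the question is the missing parentheses after avg_w: apply receives something it can call for each group rather than an already computed value.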