I'm having a hard time understanding why the apply function isn't working here. I'm trying to fill the null values of SalePrice with the mean sale price of their corresponding quality rating (OverallQual).
I expected the function to iterate through each row and return the mean SalePrice for the corresponding OverallQual where SalePrice is null, and otherwise return the original SalePrice.
sale_price_by_qual = df.groupby('OverallQual')['SalePrice'].mean()
def fill_sales_price(SalePrice, OverallQual):
    if np.isnan(SalePrice):
        return sale_price_by_qual[SalePrice]
    else:
        return SalePrice
df['SalePrice'] = df.apply(lambda x: fill_sales_price(x['SalePrice'], x['OverallQual']), axis=1)
KeyError: nan
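Try this. Your function looks up sale_price_by_qual[SalePrice], but SalePrice is NaN in exactly the rows being filled, so the lookup raises KeyError: nan. Index by OverallQual instead: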
def fill_sales_price(SalePrice, OverallQual):
    if np.isnan(SalePrice):
        return sale_price_by_qual[OverallQual]
    else:
        return SalePrice
df['SalePrice'] = df.apply(lambda x: fill_sales_price(x['SalePrice'], x['OverallQual']), axis=1)
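As an aside, the same fill can usually be done without a row-wise apply. Here is a minimal sketch, assuming a small made-up DataFrame in place of your real data (the values are hypothetical, only the column names come from the question). It uses groupby with transform('mean') to build a Series of per-group means aligned to the original rows, then fillna to replace only the missing prices:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the real housing DataFrame.
df = pd.DataFrame({
    "OverallQual": [5, 5, 7, 7, 7],
    "SalePrice": [100000.0, np.nan, 200000.0, 300000.0, np.nan],
})

# Mean SalePrice of each row's OverallQual group, aligned to df's index.
# NaN prices are skipped when the group mean is computed.
group_mean = df.groupby("OverallQual")["SalePrice"].transform("mean")

# Replace only the missing prices; existing values are left untouched.
df["SalePrice"] = df["SalePrice"].fillna(group_mean)

print(df)
```

This avoids calling a Python function once per row, which tends to matter on larger frames. Note that if every SalePrice in a quality group is missing, that group's mean is also NaN and those rows stay unfilled.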