I have a dataframe
Gender
0 Female
1 Female
2
3 Female
4 Female
with a Gender column that has some NA values. The split between genders is:
Male 5453
Female 4543
Name: Gender, dtype: int64
When trying to fill in the missing values with the value Male, because it's the most common, using this code:
data['Gender'] = data['Gender'].fillna(data['Gender'].value_counts().idxmax)
I just seem to get the same values:
data['Gender'].value_counts()
Male 5453
Female 4543
<bound method Series.idxmax of Male 5453\nFemale 4543\nName: Gender, dtype: int64> 4
Name: Gender, dtype: int64
It seems no change has been made, as far as the counts go, but
data.isnull().any()
results in False
Then when I try to change the datatype to category:
data['Gender'] = data['Gender'].astype('category')
I get this error:
TypeError: 'Series' objects are mutable, thus they cannot be hashed
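As already said by Tserenjamts, this most likely happens because the values you want to fill are not NaN but empty strings. There is also an error in your code: idxmax is missing its parentheses, so fillna does not fill the NaNs with the most frequent value but with the idxmax method object itself. That is the <bound method Series.idxmax ...> entry that shows up in your value_counts() output, and it is presumably also what makes the later astype('category') call fail.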
Try this to fix your error:
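import numpy as np
# call idxmax() with parentheses and assign the result back to the column
data['Gender'] = data['Gender'].replace('', np.nan).fillna(data['Gender'].value_counts().idxmax())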
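To see the whole fix end to end, here is a minimal sketch on made-up data. The small DataFrame below is hypothetical (it is not your real data) and assumes the missing entries could be either real NaN values or empty strings; the variable names gender and most_common are just for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: one real NaN and one empty string stand in for missing values.
data = pd.DataFrame({"Gender": ["Male", "Female", np.nan, "Male", "", "Male"]})

# Turn empty strings into NaN so that fillna can see them.
gender = data["Gender"].replace("", np.nan)

# idxmax() is called here (note the parentheses), so it returns the most
# frequent label instead of the bound method object.
most_common = gender.value_counts().idxmax()

# Fill the gaps, then convert to category; every value is now a plain string.
data["Gender"] = gender.fillna(most_common).astype("category")

print(data["Gender"].value_counts())
```

Computing the most common value after the replace step means an empty string can never be picked as the fill value. If you prefer, gender.mode()[0] should give the same label in a case like this one.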