
Cannot fill in blank values in Pandas


I have a dataframe

Gender

0 Female
1 Female
2
3 Female
4 Female

with a Gender column that has some NA values. The split between genders is:

Male      5453
Female    4543
Name: Gender, dtype: int64

When I try to fill in the missing values with Male, because it's the most common value, using this code:

data['Gender'] = data['Gender'].fillna(data['Gender'].value_counts().idxmax)

I just seem to get the same values:

data['Gender'].value_counts()

Male                                                                                          5453
Female                                                                                        4543
<bound method Series.idxmax of Male      5453\nFemale    4543\nName: Gender, dtype: int64>       4
Name: Gender, dtype: int64

It seems no change has been made, as far as the counts go, but

data.isnull().any()

results in False

Then when I try to change the datatype to category:

data['Gender'] = data['Gender'].astype('category')

I get this error:

TypeError: 'Series' objects are mutable, thus they cannot be hashed

Solution

  • As Tserenjamts already said, this most likely happens because the values you want to fill are not NaN but empty strings. There is also an error in your code: idxmax is missing its parentheses, so instead of filling the NaNs with the most frequent value, it fills them with the idxmax method object itself. That object is the long third entry in your value_counts() output.

    Try this to fix your error:

    # Assign the result back; replace() and fillna() return a new Series.
    data['Gender'] = data['Gender'].replace('', np.nan).fillna(data['Gender'].value_counts().idxmax())
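
For reference, here is a minimal, self-contained sketch of the same idea. The sample values are made up for illustration (they are not your data), and the variable names gender and most_common are only placeholders. It assumes the blanks may be either true NaN or empty strings, so it normalizes both before filling.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: one real NaN and one empty string.
data = pd.DataFrame(
    {"Gender": ["Female", "Male", np.nan, "Male", "", "Female", "Male"]}
)

# Turn empty strings into NaN so that fillna() can see them.
gender = data["Gender"].replace("", np.nan)

# Note the parentheses: idxmax() returns the most frequent label,
# while idxmax without them is just the method object.
most_common = gender.value_counts().idxmax()

# Fill the gaps, then convert to category.
data["Gender"] = gender.fillna(most_common).astype("category")

print(data["Gender"].value_counts())
```

If the column has already been overwritten by the earlier attempt, it probably needs to be reloaded from the source before running this, since the missing entries are no longer NaN at that point.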