python data-analysis data-cleaning

Replacing NaN values in a column with the mode of that column within each Category


df['Android Ver'].fillna(str(df.groupby('Category')['Android Ver'].mode()), inplace=True)

This piece of code is giving an error! I want to fill the NaN values in the 'Android Ver' column with the mode of 'Android Ver' within each app's 'Category', so that, for example, a missing value for a Beauty app is filled with the most common Android version among the Beauty apps only.


Solution

  • If you run df.loc[df['Android Ver'].isna()] you'll see that there are only two NaNs in the column, so in this instance you could replace them manually. But here's a (surely suboptimal) general solution:

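    import pandas as pd
    df = pd.read_csv('./datasets/apps.csv', index_col=0)
    # Most frequent 'Android Ver' per category; ties resolve to the first mode.
    # Assumes every category has at least one non-missing 'Android Ver'.
    mode_by_category = df.groupby('Category')['Android Ver'].agg(lambda x: x.mode().iloc[0])
    # Fill only the missing rows; a single .loc call avoids chained assignment
    missing = df['Android Ver'].isna()
    df.loc[missing, 'Android Ver'] = df.loc[missing, 'Category'].map(mode_by_category)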
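
As for the error in the question: as far as I know, pandas groupby objects don't expose a mode() method, which is probably why the original line fails, and wrapping the result in str() would in any case hand fillna one long string instead of a per-row value. Here is a minimal sketch of a more compact alternative using transform. The toy DataFrame and the first_mode helper are made up for illustration and are not part of the original dataset or of pandas itself.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the apps dataset
df = pd.DataFrame({
    "Category": ["BEAUTY", "BEAUTY", "BEAUTY", "GAME", "GAME", "GAME"],
    "Android Ver": ["4.1 and up", "4.1 and up", np.nan,
                    "5.0 and up", np.nan, "5.0 and up"],
})

def first_mode(values: pd.Series):
    """Return the most frequent non-missing value, or NaN if there is none."""
    modes = values.mode()  # mode() ignores NaN by default
    return modes.iloc[0] if not modes.empty else np.nan

# transform broadcasts each category's mode back onto that category's rows,
# giving a Series aligned with df that fillna can use row by row
group_mode = df.groupby("Category")["Android Ver"].transform(first_mode)
df["Android Ver"] = df["Android Ver"].fillna(group_mode)
print(df)
```

If a category has several equally common values, this picks the first one that mode() returns; a category with no non-missing values at all is left as NaN.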