python rounding fillna

Python: Round() is not working with my fillna()


import seaborn
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

passengers = pd.read_csv('passengers.csv')
#passengers['Age'].fillna(value=round(passengers['Age'].mean()), inplace=True)
passengers['Age'].fillna(value=round(np.mean(passengers['Age'])), inplace=True)

Here are the two versions I tried.

The idea is to fill any NaN values with the average passenger age, and I wanted to take it a step further by rounding that figure.

It worked in Codecademy's terminal, but in my Jupyter Notebook the value doesn't get rounded.

Did I do something wrong?


Solution

  • Try using SimpleImputer() from sklearn. Here is the working example from the official documentation:

    import numpy as np
    from sklearn.impute import SimpleImputer
    
    imp_mean = SimpleImputer(missing_values=np.nan, strategy='mean')
    imp_mean.fit([[7, 2, 3], [4, np.nan, 6], [10, 5, 9]]) # your column
    
    X = [[np.nan, 2, 3], [4, np.nan, 6], [10, np.nan, 9]]
    print(imp_mean.transform(X))
    # Output:
    [[ 7.   2.   3. ]
     [ 4.   3.5  6. ]
     [10.   3.5  9. ]]
    

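    You can choose the mean, median, most frequent value (mode), or a constant through the strategy parameter ('mean', 'median', 'most_frequent', 'constant'). Please see the official documentation.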

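    You can also call fit_transform to fit and fill a single column in one go. SimpleImputer expects 2-D input, so select the column with double brackets and flatten the result, like passengers['Age'] = imp_mean.fit_transform(passengers[['Age']]).ravel()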

    Once you have the updated column, you can round it with apply(), like passengers['Age'] = passengers['Age'].apply(lambda x: round(x)), or more simply with passengers['Age'] = passengers['Age'].round()

    This might not be the most efficient solution, but it'll work ;)
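
Putting the steps above together, here is a minimal sketch of imputing and rounding one column. The small DataFrame is a made-up stand-in for passengers.csv (an assumption, since the real file isn't shown), so only the 'Age' column name is taken from the question.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Hypothetical stand-in for passengers.csv; only the 'Age' column matters here.
passengers = pd.DataFrame({"Age": [22.0, np.nan, 35.0, 31.0, np.nan]})

imp_mean = SimpleImputer(missing_values=np.nan, strategy="mean")

# SimpleImputer expects 2-D input, so select the column with double brackets.
filled = imp_mean.fit_transform(passengers[["Age"]])

# Flatten back to 1-D, round to whole numbers, and assign the result back.
passengers["Age"] = filled.ravel().round()

print(passengers["Age"].tolist())
```

The column keeps its float dtype after rounding, so values still display with a trailing .0. If whole-number ages are wanted, passengers['Age'].astype(int) converts them once no NaN values remain.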