Tags: python, pandas, function, mean, median

Create a function in Python that imputes mean or median values in a pandas DataFrame


I have a DataFrame:

import numpy as np
import pandas as pd

data = {'Age': [18, np.nan, 17, 14, 15, np.nan, 17, 17]}
df = pd.DataFrame(data)
df

I would like to write a function that imputes either the mean or the median, using

df = df.fillna(df.mean())
df = df.fillna(df.median())

Desired output for mean (rounded to one decimal place)

data = {'Age':[18, 16.3, 17, 14, 15, 16.3, 17, 17]} 
df = pd.DataFrame(data) 
df

Desired output for median

data = {'Age':[18, 17, 17, 14, 15, 17, 17, 17]} 
df = pd.DataFrame(data) 
df
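
The two fill values can be checked directly on the question's data. This is a minimal sketch; the names `age_mean` and `age_median` are introduced here for illustration only. It relies on pandas skipping NaN by default when computing `mean()` and `median()`.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Age': [18, np.nan, 17, 14, 15, np.nan, 17, 17]})

# mean() and median() ignore NaN by default (skipna=True),
# so only the six known ages are used
age_mean = df['Age'].mean()      # 98 / 6, roughly 16.33
age_median = df['Age'].median()  # middle of 14, 15, 17, 17, 17, 18

print(round(age_mean, 1))  # 16.3
print(age_median)          # 17.0
```

The mean is not exactly 16.3, so the filled values only match the desired output after rounding.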

Solution

  • Use a function:

    def f(df, func):
        # Fill NaNs in each column with that column's mean or median
        if func in ['mean', 'median']:
            return df.fillna(df.agg(func))
        else:
            raise ValueError("Wrong function, use only 'mean' or 'median'")

    If you need the mean, use:

    df = f(df, 'mean')
    

    If you need the median, use:

    df = f(df, 'median')
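
    As an end-to-end check, the sketch below restates `f` and runs it on the question's data. The names `mean_filled` and `median_filled` are illustrative, and using `ValueError` as the error type is an assumption of this sketch. Each result is kept in its own variable because `fillna` returns a new DataFrame and leaves the original untouched.

```python
import numpy as np
import pandas as pd

def f(df, func):
    # Fill NaNs in each column with that column's mean or median
    if func in ['mean', 'median']:
        return df.fillna(df.agg(func))
    raise ValueError("Wrong function, use only 'mean' or 'median'")

df = pd.DataFrame({'Age': [18, np.nan, 17, 14, 15, np.nan, 17, 17]})

mean_filled = f(df, 'mean')
median_filled = f(df, 'median')

# Round only for display; the stored mean is 98 / 6
print(mean_filled['Age'].round(1).tolist())
# [18.0, 16.3, 17.0, 14.0, 15.0, 16.3, 17.0, 17.0]
print(median_filled['Age'].tolist())
# [18.0, 17.0, 17.0, 14.0, 15.0, 17.0, 17.0, 17.0]
```

    Reassigning `df = f(df, 'mean')` and then calling `f(df, 'median')` would change nothing, since no NaN values remain after the first call.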