
How can I get the average (mean) of selected columns and impute the NAs?


This question can be viewed as an extension of the thread below (How can I get the average (mean) of selected columns). How do we impute the missing values, i.e. the NAs, using the mean of the selected columns?


Solution

  • One option is na.aggregate from zoo, which replaces the missing values (NA) in a column with the mean of that column's non-missing values. We loop over the selected columns of the dataset with lapply(df1[4:8], ...), apply the function to each column, and assign the result back to the same columns on the lhs of <-

    library(zoo)
    # impute the NAs in columns 4 to 8 with each column's own mean
    df1[4:8] <- lapply(df1[4:8], na.aggregate)

    If we need the median instead, pass FUN = median (the default FUN is mean)

    df1[4:8] <- lapply(df1[4:8], na.aggregate, FUN = median)
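
If zoo is not available, the same column-wise imputation can be sketched in base R. This is only an illustration of the idea, not what na.aggregate does internally: the data frame, its column names, and the helper impute_na are all made up for the example.

```r
# Hypothetical data; the column names and values are invented for illustration.
df1 <- data.frame(
  id = 1:4,
  a  = c(1, NA, 3, 4),
  b  = c(NA, 10, 20, 30)
)

# Hypothetical helper: replace the NAs in a vector with a summary
# (mean by default) of its non-missing values.
impute_na <- function(x, FUN = mean) {
  x[is.na(x)] <- FUN(x[!is.na(x)])
  x
}

cols <- c("a", "b")  # the selected columns to impute

# Impute with the column means
df_mean <- df1
df_mean[cols] <- lapply(df1[cols], impute_na)

# Impute with the column medians
df_median <- df1
df_median[cols] <- lapply(df1[cols], impute_na, FUN = median)
```

Columns outside cols (here id) are left untouched, and a column that is entirely NA would end up as NaN with the mean, so that case may need separate handling.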