Search code examples
rimputation

Imputation in R


I am new in R programming language. I just wanted to know is there any way to impute null values of just one column in our dataset. Because all of imputation commands and libraries that I have seen, impute null values of the whole dataset.


Solution

  • Here is an example using the Hmisc package and impute

    library(Hmisc)
    DF <- data.frame(age = c(10, 20, NA, 40), sex = c('male','female'))
    
    # impute with mean value
    
    DF$imputed_age <- with(DF, impute(age, mean))
    
    # impute with random value
    DF$imputed_age2 <- with(DF, impute(age, 'random'))
    
    # impute with the media
    with(DF, impute(age, median))
    # impute with the minimum
    with(DF, impute(age, min))
    
    # impute with the maximum
    with(DF, impute(age, max))
    
    
    # and if you are sufficiently foolish
    # impute with number 7 
    with(DF, impute(age, 7))
    
     # impute with letter 'a'
    with(DF, impute(age, 'a'))
    

    Look at ?impute for details on how the imputation is implemented