
How to fill missing cells of a data frame in R?


I have a dataset like this.

df = data.frame( name= c("Tommy", "John", "Dan"), age = c(20, NA, NA) )

I tried to set the age of John and Dan to 15.

df[ ( df$age != 20) , ]$age = 15

But I got the following error:

Error in [<-.data.frame(tmp, (df$age != 20), , value = list(name = c(NA_integer_, : missing values are not allowed in subscripted assignments of data frames

What is a nice way to set new values to these missing cells?


Solution

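  • If you want to replace every age that is not 20, including the missing cells and any other valid values, note that df$age != 20 evaluates to NA for missing ages, so the condition also needs is.na(). I would do the following: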

    # Creating a data frame with another valid age
    df = data.frame( name= c("Tommy", "John", "Dan","Bob"), age = c(20, NA, NA,12) )
    
    # Replace every age other than 20 (including NA) with 15
    df[df$age != 20 | is.na(df$age), "age"] <- 15
    
       name age
    1 Tommy  20
    2  John  15
    3   Dan  15
    4   Bob  15
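
If you only want to fill the missing cells and leave other valid ages untouched, a minimal sketch along the same lines is shown below. It reuses the data frame from the question and assumes 15 is the fill value you want; it also shows why the original comparison produced missing values in the index.

```r
# Same example data as in the question
df <- data.frame(name = c("Tommy", "John", "Dan"), age = c(20, NA, NA))

# Comparing NA with a value yields NA rather than TRUE or FALSE,
# so this logical index contains missing values
df$age != 20
# [1] FALSE    NA    NA

# Select only the missing ages with is.na() and assign the fill value
df$age[is.na(df$age)] <- 15

df
#    name age
# 1 Tommy  20
# 2  John  15
# 3   Dan  15
```

Because the index comes from is.na() alone, it never contains NA itself, so the assignment does not raise the error from the question.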