Age <- c(90,56,51,'NULL',67,'NULL',51)
Sex <- c('Male','Female','NULL','male','NULL','Female','Male')
Tenure <- c(2,'NULL',3,4,3,3,4)
df <- data.frame(Age, Sex, Tenure)
In the above example, the 'NULL' values are stored as character strings.
I am trying to put NA in place of the 'NULL' values. I was able to do it for a single column with df$Age[which(df$Age == 'NULL')] <- NA
However, I don't want to write this out for every column.
How can I apply similar logic to all columns so that all the 'NULL' values of df are converted to NAs? I am guessing that apply, a custom-defined function, or a for loop will do it.
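We can use dplyr to replace the 'NULL' values in all the columns and then convert the type of each column with type.convert. Because every vector contains the string 'NULL', none of the columns is numeric yet: they are all factor class when data.frame() runs with stringsAsFactors = TRUE (the default before R 4.0.0) and character class from R 4.0.0 onward. This assumes that Age and Tenure should end up as numeric/integer class.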
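library(dplyr)
# Replace 'NULL' with NA in every column, then let type.convert() infer each column's type.
# across() (dplyr >= 1.0.0) is used because funs() is deprecated; as.is = FALSE keeps Sex a factor.
res <- df %>%
  mutate(across(everything(), ~ type.convert(as.character(replace(.x, .x == 'NULL', NA)), as.is = FALSE)))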
str(res)
#'data.frame': 7 obs. of 3 variables:
#$ Age : int 90 56 51 NA 67 NA 51
#$ Sex : Factor w/ 3 levels "Female","male",..: 3 1 NA 2 NA 1 3
#$ Tenure: int 2 NA 3 4 3 3 4
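
For comparison, here is a minimal base R sketch of the same two steps (replace 'NULL' with NA, then re-infer the column types) that does not need dplyr. It is not part of the answer above. Building the data frame with stringsAsFactors = FALSE and passing as.is = TRUE are choices made here for illustration, so Sex comes out as character rather than factor.

```r
# Rebuild the example data; stringsAsFactors = FALSE keeps every column as character
Age <- c(90, 56, 51, 'NULL', 67, 'NULL', 51)
Sex <- c('Male', 'Female', 'NULL', 'male', 'NULL', 'Female', 'Male')
Tenure <- c(2, 'NULL', 3, 4, 3, 3, 4)
df <- data.frame(Age, Sex, Tenure, stringsAsFactors = FALSE)

# Logical-matrix indexing: every cell equal to the string 'NULL' becomes NA
df[df == 'NULL'] <- NA

# Re-infer each column's type; df[] keeps the data.frame structure,
# and as.is = TRUE leaves non-numeric columns as character instead of factor
df[] <- lapply(df, function(col) type.convert(as.character(col), as.is = TRUE))

str(df)
```

If the data originally comes from a file, passing na.strings = 'NULL' to read.csv() or read.table() would likely avoid the problem at import time.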