I want to replace missing values with the mean value within the same sex.
For example, if 'patient A - male' has a missing value in pain, the missing value should be replaced with the mean value of pain among males.
rawdata <- rawdata %>%
  mutate(replace_pain = ifelse(is.na(pain) & sex == "male",
                               rawdata %>%
                                 filter(sex == "male") %>%
                                 mean(pain, na.rm = TRUE),
                               ifelse(is.na(pain) & sex == "female",
                                      rawdata %>%
                                        filter(sex == "female") %>%
                                        mean(pain, na.rm = TRUE),
                                      pain)))
It has two problems.
1) The code is a little messy.
2) It doesn't work. The warning below appears, and the problem seems to be with the %>% mean code.
Warning message:
In mean.default(., pain, na.rm = TRUE) :
argument is not numeric or logical: returning NA
Is there a better way to impute missing values by condition?
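Your code does not work because the pipe passes the whole filtered data frame as the first argument of mean(), and mean() cannot average a data frame. Use summarise(mean(pain, na.rm = TRUE)) instead of calling mean(pain, na.rm = TRUE) directly. Because summarise() returns a one-row data frame rather than a number, pull() is added here to extract the value before it goes into ifelse().
library(dplyr)

rawdata %>%
  mutate(replace_pain = ifelse(is.na(pain) & sex == "male",
    # mean pain among males, extracted as a single number
    rawdata %>% filter(sex == "male") %>% summarise(mean(pain, na.rm = TRUE)) %>% pull(),
    ifelse(is.na(pain) & sex == "female",
      # mean pain among females, extracted as a single number
      rawdata %>% filter(sex == "female") %>% summarise(mean(pain, na.rm = TRUE)) %>% pull(),
      pain)))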
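To see why the original pipeline produced the warning, it may help to compare the three forms side by side. This is a minimal sketch; the toy rawdata below is hypothetical and only stands in for the real data, and the names males, bad, as_df and as_num are made up for illustration.

```r
library(dplyr)

# Hypothetical toy data standing in for the real rawdata
rawdata <- data.frame(
  sex  = c("male", "male", "male", "female", "female", "female"),
  pain = c(2, 4, NA, 5, NA, 7)
)

males <- rawdata %>% filter(sex == "male")

# The pipe hands the whole data frame to mean() as its first argument,
# so mean() warns ("argument is not numeric or logical") and returns NA.
bad <- suppressWarnings(males %>% mean(pain, na.rm = TRUE))

# summarise() evaluates mean() on the pain column instead;
# the result is still a data frame with one row and one column.
as_df <- males %>% summarise(avg = mean(pain, na.rm = TRUE))

# pull() extracts that single value as a plain number.
as_num <- as_df %>% pull(avg)
```

Only the last form yields a bare number that can be dropped into ifelse() safely.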
The code is still quite messy; it would probably be nicer to define avg_pain_female and avg_pain_male variables first.
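One way this could look is sketched below, assuming pain is numeric and sex only takes the values "male" and "female". The toy rawdata is hypothetical, and case_when() is used here in place of the nested ifelse() purely for readability. The second half shows a different technique, a grouped mutate, which is not what the answer above describes but avoids naming each group by hand.

```r
library(dplyr)

# Hypothetical toy data standing in for the real rawdata
rawdata <- data.frame(
  sex  = c("male", "male", "male", "female", "female", "female"),
  pain = c(2, 4, NA, 5, NA, 7)
)

# Compute each group's mean once, as plain numbers
avg_pain_male   <- mean(rawdata$pain[rawdata$sex == "male"],   na.rm = TRUE)
avg_pain_female <- mean(rawdata$pain[rawdata$sex == "female"], na.rm = TRUE)

# Fill missing values with the mean of the matching sex
imputed <- rawdata %>%
  mutate(replace_pain = case_when(
    is.na(pain) & sex == "male"   ~ avg_pain_male,
    is.na(pain) & sex == "female" ~ avg_pain_female,
    TRUE                          ~ pain
  ))

# Alternative technique: let group_by() compute the per-sex mean,
# so no group has to be spelled out explicitly.
imputed_grouped <- rawdata %>%
  group_by(sex) %>%
  mutate(replace_pain = ifelse(is.na(pain), mean(pain, na.rm = TRUE), pain)) %>%
  ungroup()
```

Both versions keep the original pain column untouched and write the filled values to replace_pain. The grouped form also extends to more than two groups without extra branches.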