I simulated a data set with the following assumptions:
x1 <- rbinom(100, 1, 0.5) # trt (size = 1 gives a 0/1 indicator; size = 0 would return only zeros)
x2 <- rnorm(100,0,1) # metric outcome
df <- data.frame(x1,x2)
Now I'm trying to introduce missing values under two different mechanisms: first "missing completely at random" (MCAR) and second "missing not at random" (MNAR). I tried several packages, but none of them worked as I expected.
For the first scenario (MCAR) I used:
library(mice) # provides ampute()
df_mcar <- ampute(data = df, prop = 0.1, mech = "MCAR", patterns = c(1, 0))$amp
... and it seems to work: only x2 has missing values, with a probability of 10%, independently of x1.
For the second scenario I again want only x2 to have missing values, but this time conditional on x1: x2 should be missing in 10% of cases only when x1 = 1.
In other words, x2 should be missing with probability p = 0.1 when x1 = 1 and with probability p = 0 when x1 = 0.
I would be glad for any hint or a simple solution :)
PS: I often see prodNA(...) suggested, but it does not work.
You could probably do something like:
library(dplyr)
# set x2 to NA with probability 0.1, but only in rows where x1 == 1
df %>%
  mutate(
    x2 = if_else(x1 == 1 & runif(n()) < .1, NA_real_, x2)
  )
My R is currently too busy for me to run the code, though.
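Since the snippet above was not run, here is a minimal base R sketch of the same idea that also checks the resulting missingness rate per x1 group. The seed, the larger sample size (10000 instead of 100, so the rates are easier to see) and the names make_na, df_cond and miss_rate are my own assumptions, not part of the question.

```r
set.seed(42)                     # assumed seed, only for reproducibility

n  <- 10000                      # assumed: larger than the original 100
x1 <- rbinom(n, 1, 0.5)          # treatment indicator (0/1)
x2 <- rnorm(n, 0, 1)             # metric outcome
df <- data.frame(x1, x2)

# Flag rows to blank out: only where x1 == 1, each with probability 0.1
make_na <- df$x1 == 1 & runif(n) < 0.1

df_cond <- df
df_cond$x2[make_na] <- NA        # x1 itself stays fully observed

# Share of missing x2 values within each x1 group
miss_rate <- tapply(is.na(df_cond$x2), df_cond$x1, mean)
print(miss_rate)
```

The rate for x1 = 0 should be exactly 0, and the rate for x1 = 1 should be close to 0.1. With only 100 rows the observed share can differ noticeably from 10%, because each row is dropped independently.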