I want to add a series of dummy variables to a data frame, one for each value of x in that data frame, with each dummy containing NA where another variable is NA. For example, suppose I have the data frame below:
x <- 1:5
y <- c(NA, 1, NA, 0, NA)
z <- data.frame(x, y)
I am looking to produce:
I can't figure out how to vectorize this. I am looking for a solution that works for a large number of distinct values of x.
There was some confusion that I wanted to iterate through each index of x. I am not looking for this, but rather for a solution that creates a variable for each unique value of x. When taking the below data as an input:
x <- c(1,1,2,3,9)
y <- c(NA, 1, NA, 0, NA)
z <- data.frame(x, y)
I am looking for z$var1, z$var2, z$var3, z$var9 where z$var1 <- c(1, 1, NA, 0, NA) and z$var2 <- c(NA, 0, 1, 0, NA). The original solution produces z$var1 <- z$var2 <- c(1,1,NA,0,NA).
You can use ifelse(), which is vectorized, to construct the variables:
vals <- unique(z$x)
dummies <- sapply(vals, function(i) ifelse(z$x == i, 1, ifelse(is.na(z$y), NA, 0)))
cbind(z, setNames(data.frame(dummies), paste0("var", vals)))
  x  y var1 var2 var3 var9
1 1 NA    1   NA   NA   NA
2 1  1    1    0    0    0
3 2 NA   NA    1   NA   NA
4 3  0    0    0    1    0
5 9 NA   NA   NA   NA    1
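To see what the nested ifelse() is doing, here is the same logic spelled out for a single value of x. This is only a rough sketch using the question's second example; the names fallback and var2 are just for illustration.

```r
x <- c(1, 1, 2, 3, 9)
y <- c(NA, 1, NA, 0, NA)

# Inner ifelse: NA where y is missing, 0 otherwise
fallback <- ifelse(is.na(y), NA, 0)

# Outer ifelse: rows where x equals the value get 1, regardless of y;
# all other rows take the fallback
var2 <- ifelse(x == 2, 1, fallback)
var2
```

The sapply() call in the answer simply repeats this for every unique value of x and collects the results as columns.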
Update: without setNames(), the new columns get R's default names (X1, X2, ...):
cbind(z, data.frame(sapply(unique(x), function(i) ifelse(x == i, 1, ifelse(is.na(y), NA, 0)))))
  x  y X1 X2 X3 X4
1 1 NA  1 NA NA NA
2 1  1  1  0  0  0
3 2 NA NA  1 NA NA
4 3  0  0  0  1  0
5 9 NA NA NA NA  1
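If x has many unique values, one possible alternative (not part of the answer above) is to build the whole indicator matrix at once with outer() instead of calling ifelse() per value. The sketch below is a hypothetical helper; the name add_dummies and its arguments are my own, and it assumes x itself contains no NA.

```r
# Hypothetical helper: one dummy column per unique value of df[[value_col]],
# with NA in non-matching cells wherever df[[na_col]] is NA.
add_dummies <- function(df, value_col = "x", na_col = "y", prefix = "var") {
  vals <- unique(df[[value_col]])
  # n x k matrix: 1 where the row's value equals the column's value, else 0
  m <- outer(df[[value_col]], vals, "==") * 1
  # is.na() has one entry per row, so it recycles down each column of m
  m[m == 0 & is.na(df[[na_col]])] <- NA
  colnames(m) <- paste0(prefix, vals)
  cbind(df, m)
}

z <- data.frame(x = c(1, 1, 2, 3, 9), y = c(NA, 1, NA, 0, NA))
res <- add_dummies(z)
res
```

On the example data this should give the same var1, var2, var3 and var9 columns as the sapply() version. I have not benchmarked the two, so treat any speed difference as something to check on your own data.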