I have a very large number of variables in a dataframe that represent binary outcomes (0, 1). I am trying to create a new dataframe that makes contrasts of each pair of variables using an ifelse() condition.
Here is a minimal example to replicate what I am trying to achieve.
# initial dataframe
set.seed(123)
df1 <- data.frame(sf1 = sample(0:1,10, replace=T), sf2 = sample(0:1,10, replace=T), sf3 = sample(0:1,10, replace=T))
# get all pairwise combinations of col names
two_way <- combn(colnames(df1), 2, FUN=paste, collapse='&')
# create contrasts of col names
dfs<-paste('ifelse(',two_way,',1,0)', sep='')
dfs
[1] "ifelse(sf1&sf2,1,0)" "ifelse(sf1&sf3,1,0)" "ifelse(sf2&sf3,1,0)"
This creates a character vector of the desired ifelse() conditions. What I now want to do is get those conditions into a mutate function to create a new dataframe with all my contrasts. Something like this:
library(dplyr)

df2 <- df1 %>%
  mutate(
    data.frame(
      ifelse(sf1 & sf2, 1, 0),
      ifelse(sf1 & sf3, 1, 0),
      ifelse(sf2 & sf3, 1, 0),
      check.names = FALSE
    )
  )
df2
sf1 sf2 sf3 ifelse(sf1 & sf2, 1, 0) ifelse(sf1 & sf3, 1, 0) ifelse(sf2 & sf3, 1, 0)
1 0 1 0 0 0 0
2 0 0 0 0 0 0
3 0 1 0 0 0 0
4 0 0 1 0 0 0
5 1 0 1 0 1 0
6 1 1 1 1 1 1
7 1 1 1 1 1 1
8 0 1 1 0 0 1
9 0 1 0 0 0 0
10 1 1 0 1 0 0
How can I pass the vector of character contrasts dfs to the mutate function? Is this possible or is there a better way of achieving the desired outcome?
ifelse(sf1&sf2,1,0) can also be written as as.integer(sf1&sf2).
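To make that equivalence concrete, here is a small sketch on two made-up vectors (a and b are illustrative names, not columns from the question). The values match, although the two forms differ in storage type: ifelse(..., 1, 0) gives a double vector, while as.integer() gives an integer vector.

```r
# Hypothetical toy vectors covering all four 0/1 combinations
a <- c(0, 0, 1, 1)
b <- c(0, 1, 0, 1)

via_ifelse  <- ifelse(a & b, 1, 0)  # double vector of 0/1
via_integer <- as.integer(a & b)    # integer vector of 0/1

# Element-wise comparison: the values agree
all(via_ifelse == via_integer)

# Storage types differ, which only matters if you rely on identical()
typeof(via_ifelse)
typeof(via_integer)
```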
I'll do this in base R using combn. Use combn to create the combinations of column names; for each combination, subset the data from the original dataset and return a named list of 1/0 integers, which is appended to the original dataset using cbind.
cbind(df1, combn(names(df1), 2, \(x) {
  setNames(
    list(as.integer(df1[[x[1]]] & df1[[x[2]]])),
    paste0(x, collapse = "_")
  )
}, simplify = FALSE))
# sf1 sf2 sf3 sf1_sf2 sf1_sf3 sf2_sf3
#1 0 1 0 0 0 0
#2 0 1 1 0 0 1
#3 0 1 0 0 0 0
#4 1 0 0 0 0 0
#5 0 1 0 0 0 0
#6 1 0 0 0 0 0
#7 1 1 1 1 1 1
#8 1 0 1 0 1 0
#9 0 0 0 0 0 0
#10 0 0 1 0 0 0
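If you do want to start from the character vector dfs itself, one possible approach is to parse each string and evaluate it with the columns of df1 in scope. This is only a sketch in base R, not necessarily the best way: the new column names (built with gsub) are my own choice for illustration, and evaluating code built from strings is generally more fragile than working with the column names directly as above.

```r
# Recreate the example data and the character vector of conditions
set.seed(123)
df1 <- data.frame(
  sf1 = sample(0:1, 10, replace = TRUE),
  sf2 = sample(0:1, 10, replace = TRUE),
  sf3 = sample(0:1, 10, replace = TRUE)
)
two_way <- combn(colnames(df1), 2, FUN = paste, collapse = "&")
dfs <- paste0("ifelse(", two_way, ",1,0)")

# Parse each string into an expression and evaluate it inside df1,
# so that sf1, sf2, ... resolve to the data frame's columns
contrasts <- lapply(dfs, function(s) eval(parse(text = s), envir = df1))

# Illustrative column names such as "sf1_sf2" (an assumption, not from the question)
names(contrasts) <- gsub("&", "_", two_way, fixed = TRUE)

# Append the evaluated contrasts to the original columns
df2 <- cbind(df1, as.data.frame(contrasts))
df2
```

The values in df2 depend on the R version used, because the default sampling algorithm behind sample() changed in R 3.6.0.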