I thought of the following example to illustrate my question.
Suppose there are 5 balls:
Normally, there are 5! = 120 ways these balls can be organized (n!). I can enumerate all these combinations below:
library(combinat)
library(dplyr)
my_list = c("Red", "Blue", "Green", "Yellow", "Orange")
d = permn(my_list)
all_combinations = as.data.frame(matrix(unlist(d), ncol = 120)) %>%
setNames(paste0("col", 1:120))
all_combinations[,1:5]
col1 col2 col3 col4 col5
1 Red Red Red Red Orange
2 Blue Blue Blue Orange Red
3 Green Green Orange Blue Blue
4 Yellow Orange Green Green Green
5 Orange Yellow Yellow Yellow Yellow
My Question:
Suppose I wanted to filter this list by the following conditions:
I then tried to filter the above data based on these 3 conditons:
# attempt to write first condition
cond_1 <- all_combinations[which(all_combinations[1,]== "Red" || all_combinations[2,] == "Red"), ]
#not sure how to write the second condition
# attempt to write the third condition
cond_3 <- data_frame_version[which(data_frame_version[5,] !== "Yellow" ), ]
# if everything worked, an "anti join" style statement could be written to remove "cond_1, cond_2, cond_3" from the original data?
But this is not working at all - the first and third condition return a data frame only containing 4 rows for all the columns.
Can someone please show me how to correctly filter "all_combinations" using the above 3 filters?
Note:
The following code can transpose the original data:
library(data.table)
tpose = transpose(all_combinations)
df = tpose
#group every 5 rows by the same id to identify unique combinations
bloc_len <- 5
df$bloc <-
rep(seq(1, 1 + nrow(df) %/% bloc_len), each = bloc_len, length.out = nrow(df))
head(df)
V1 V2 V3 V4 V5 bloc
1 Red Blue Green Yellow Orange 1
2 Red Blue Green Orange Yellow 1
3 Red Blue Orange Green Yellow 1
4 Red Orange Blue Green Yellow 1
5 Orange Red Blue Green Yellow 1
6 Orange Red Blue Yellow Green 2
You can do:
library(tidyverse)
tpose %>%
mutate(blue_delete = case_when(V1 == "Blue" & V2 == "Green" ~ TRUE,
V1 == "Blue" & V3 == "Green" ~ TRUE,
V2 == "Blue" & V3 == "Green" ~ TRUE,
V3 == "Blue" & V4 == "Green" ~ TRUE,
V4 == "Blue" & V5 == "Green" ~ TRUE,
TRUE ~ FALSE)) %>%
filter(V3 != "Red" & V4 != "Red" & V5 != "Red",
V5 != "Yellow",
blue_delete == FALSE) %>%
select(-blue_delete)