Search code examples
rfiltercombinations

Subset based on conditions on multiple columns


I have a combination table based on a "colour list":

library(tidyr)

#give available colour combo
colour= c("BK", "CS", "DB","DG","LB","LG","MV","OR","RD","WT","YL","ID","EMPTY")

combo = expand.grid(colour, colour,colour, colour)
#function "colnames" to change column names
colnames(combo) <- c('A','B','C','D')

However I'm not interested in all possible combinations. I want to be able to filter this table by removing the combination that I don't want. For example removing rows that have more than 1 "ID" and rows that have more than 1 "EMPTY". (removing rows like: "ID ID OR LB" or "EMPTY ID LB EMPTY".

I also want to remove rows that have no "ID" at all or no "EMPTY" at all.


Solution

  • Using your code, you can do the following:

    # Indices of rows with more than 1 "EMPTY"
      rem1 <- which(apply(combo,1,function(x) sum(x=="EMPTY"))>=2)
    # Indices of rows with more than one "ID"
      rem2 <- which(apply(combo,1,function(x) sum(x=="ID"))>=2)
    
    # Subtract rows in "rem1" and "rem2" to get combo2
      combo2 <- combo[-c(rem1, rem2),]
    

    Elements rem1 and rem2 can be combined into a unique object; nevertheless, I separated them so that you can see which indices fulfil each condition.