I have the following problem. I have a data.table and a subset of its columns M. I have a vector x defined on M.
library(data.table)
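# dimnames must be a list of (row names, column names); a bare character vector is an error
data <- matrix(c(0,0,NA,1,0,1,NA,1,0,0,1,0,1,1,NA,NA,1,0,0,1,0,0,1,1,1,0,0,1,NA,0,1,1,0,1,1,1), byrow = TRUE, ncol = 6, dimnames = list(NULL, LETTERS[1:6]))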
dt <- data.table(data)
dt
# A B C D E F
# 1: 0 0 NA 1 0 1
# 2: NA 1 0 0 1 0
# 3: 1 1 NA NA 1 0
# 4: 0 1 0 0 1 1
# 5: 1 0 0 1 NA 0
# 6: 1 1 0 1 1 1
M = LETTERS[2:5]
x <- dt[2,..M]
x
# B C D E
# 1: 1 0 0 1
I would like to remove all rows from dt whose marginal on M is equal to x, i.e. rows no. 2 and 4. Both M and x change during the program. The result for the given M and x will be:
A B C D E F
1: 0 0 NA 1 0 1
2: 1 1 NA NA 1 0
3: 1 0 0 1 NA 0
4: 1 1 0 1 1 1
data.table anti-join
dt[!x, on = M] # also works: dt[!dt[2], on = M]
# A B C D E F
# 1: 0 0 NA 1 0 1
# 2: 1 1 NA NA 1 0
# 3: 1 0 0 1 NA 0
# 4: 1 1 0 1 1 1
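Since both M and x change during the program, the anti-join can be wrapped in a small function. This is only a sketch: `drop_matching` and its argument names are hypothetical (not part of data.table), and it assumes `pattern` is a data.table containing the columns listed in `cols`.

```r
library(data.table)

# Hypothetical helper: drop the rows of `dt` whose values on `cols`
# match a row of `pattern` (an anti-join on `cols`).
drop_matching <- function(dt, pattern, cols) {
  dt[!pattern, on = cols]
}

# Same data as in the question.
dt <- data.table(matrix(
  c(0,0,NA,1,0,1, NA,1,0,0,1,0, 1,1,NA,NA,1,0,
    0,1,0,0,1,1, 1,0,0,1,NA,0, 1,1,0,1,1,1),
  byrow = TRUE, ncol = 6, dimnames = list(NULL, LETTERS[1:6])
))

# First call: the M and x from the question.
M <- LETTERS[2:5]
x <- dt[2, ..M]
res1 <- drop_matching(dt, x, M)

# Second call: a different column subset and pattern.
M2 <- c("A", "F")
x2 <- dt[6, ..M2]
res2 <- drop_matching(dt, x2, M2)
```

The function returns a new data.table each time and leaves `dt` itself unmodified, so successive calls with different `M` and `x` do not interfere with each other.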
Base R
eq2 <- Reduce('&', lapply(dt[, ..M], function(x) x == x[2]))
dt[-which(eq2),]
# A B C D E F
# 1: 0 0 NA 1 0 1
# 2: 1 1 NA NA 1 0
# 3: 1 0 0 1 NA 0
# 4: 1 1 0 1 1 1
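A variant of the same idea, as a sketch: it compares against the values in x rather than against row 2 of dt, and it subsets with a logical vector instead of `-which(...)`, so a pattern that matches no row simply leaves dt unchanged. The names `eq`, `keep`, `x_none` and `res_none` are only illustrative. Note that `==` never reports a match when either side is `NA`; such rows are kept here.

```r
library(data.table)

# Same data as in the question.
dt <- data.table(matrix(
  c(0,0,NA,1,0,1, NA,1,0,0,1,0, 1,1,NA,NA,1,0,
    0,1,0,0,1,1, 1,0,0,1,NA,0, 1,1,0,1,1,1),
  byrow = TRUE, ncol = 6, dimnames = list(NULL, LETTERS[1:6])
))
M <- LETTERS[2:5]
x <- dt[2, ..M]

# Compare every column in M with the corresponding value in x,
# then combine the per-column results with a logical AND.
eq <- Reduce(`&`, Map(function(col, val) col == val, dt[, ..M], x))

# Treat NA comparisons as "no match": keep every row that is not TRUE.
keep <- !(eq %in% TRUE)
res <- dt[keep]

# A pattern that matches nothing keeps all rows.
x_none <- data.table(B = 9, C = 9, D = 9, E = 9)
eq_none <- Reduce(`&`, Map(function(col, val) col == val, dt[, ..M], x_none))
res_none <- dt[!(eq_none %in% TRUE)]
```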