I am using R to pull a set of rows from a data frame, and many of the rows need to be pulled more than once. The rows are chosen using two criteria. Unfortunately, the result contains only the unique rows matching the criteria instead of one row per pair of criteria. To demonstrate:
Given the data.frame:
a = data.frame(array(c(1,2,3,1,4,5,6,2,7,8,9,4), c(4,3)))
Which will look like:
X1 X2 X3
1 1 4 7
2 2 5 8
3 3 6 9
4 1 2 4
Let's suppose I wish to subset a using two sets of criteria defined by vectors:
criteriaX1 = c(1,2,1,1,2)
criteriaX2 = c(4,5,4,2,5)
Then I would use this command:
a[ a$X1 %in% criteriaX1 & a$X2 %in% criteriaX2, ]
I was hoping to get 5 rows, one per pair of criteria, like so (compare criteriaX1 with the X1 column, reading top to bottom):
X1 X2 X3
1 1 4 7
2 2 5 8
3 1 4 7
4 1 2 4
5 2 5 8
But instead I just got this:
X1 X2 X3
1 1 4 7
2 2 5 8
4 1 2 4
I'm guessing it has something to do with %in% testing set membership, but I'm not sure how to get around this without an obnoxious loop. All assistance is appreciated.
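To make that guess concrete, the logical mask behind the subsetting command can be inspected directly. This is only a sketch: it rebuilds a with explicit column names so it runs on its own, and the variable mask is introduced here purely for illustration.

```r
a <- data.frame(X1 = c(1, 2, 3, 1), X2 = c(4, 5, 6, 2), X3 = c(7, 8, 9, 4))
criteriaX1 <- c(1, 2, 1, 1, 2)
criteriaX2 <- c(4, 5, 4, 2, 5)

# %in% asks "does this value appear anywhere in the set?" once per row of a,
# so the mask has nrow(a) elements, not length(criteriaX1).
mask <- a$X1 %in% criteriaX1 & a$X2 %in% criteriaX2
mask
# [1]  TRUE  TRUE FALSE  TRUE
```

A logical mask can only keep or drop each row once. The two columns are also tested independently, so the pairing between criteriaX1 and criteriaX2 is lost.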
Thanks.
You could use a data.table equi-join:
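library(data.table)
a <- data.table(a)
# One row per criteria pair, in the order requested
b <- data.table(X1 = criteriaX1, X2 = criteriaX2)
# Key a on the join columns, then look up each row of b in a
setkey(a, X1, X2)
a[b]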
# X1 X2 X3
# 1: 1 4 7
# 2: 2 5 8
# 3: 1 4 7
# 4: 1 2 4
# 5: 2 5 8
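If adding a package is not an option, the same lookup can be sketched in base R with match(). This is a rough sketch rather than a drop-in replacement: it assumes each (X1, X2) pair occurs at most once in a, since match() returns only the first hit, and the names key_a, key_b, idx and result are made up for the example.

```r
a <- data.frame(X1 = c(1, 2, 3, 1), X2 = c(4, 5, 6, 2), X3 = c(7, 8, 9, 4))
criteriaX1 <- c(1, 2, 1, 1, 2)
criteriaX2 <- c(4, 5, 4, 2, 5)

# Combine both columns into a single key so the pairing is preserved.
key_a <- paste(a$X1, a$X2, sep = "_")
key_b <- paste(criteriaX1, criteriaX2, sep = "_")

# Position in a of each requested pair (NA if a pair is absent from a).
idx <- match(key_b, key_a)

# Indexing by position allows the same row to be pulled repeatedly.
result <- a[idx, ]
rownames(result) <- NULL
result
#   X1 X2 X3
# 1  1  4  7
# 2  2  5  8
# 3  1  4  7
# 4  1  2  4
# 5  2  5  8
```

Unlike the logical mask, an integer index can repeat, which is what returns the duplicated rows in the order given by the criteria.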