Let's assume I have two lists in R, a long one and a shorter one, for example:
list1 = list(571,572,573,574,561,562,563,564,595,570,571,573)
list2 = list(c(571,564,565,600))
Please note that numbers are not ascending, they can appear more than once and they do not have the same distance to each other.
Now I would like to check for each element of list1 if it is also an element of list2. If that is the case I would like the output to be TRUE, if not FALSE. So, basically I want to get a list with the length of list1 containing TRUE and FALSE.
I don't know how to approach this problem. Maybe with a for loop or lapply? In reality, the lists are much longer: about 30,000 elements for list1 and 1,000 for list2.
Do you have any suggestions?
You could use the lapply function to check whether each element of list1 belongs to list2 iteratively:
lapply(list1, function(x){x %in% list2[[1]]})
This will return the output as a list. If you want it to return a vector instead, you can use sapply:
sapply(list1, function(x){x %in% list2[[1]]})
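Since %in% is itself vectorized, the per-element function call can probably be skipped altogether. The following is a minimal sketch, assuming every element of list1 is a single number as in the question; the names v1, res and res_list are only illustrative:

```r
list1 <- list(571, 572, 573, 574, 561, 562, 563, 564, 595, 570, 571, 573)
list2 <- list(c(571, 564, 565, 600))

# Flatten list1 into a plain numeric vector (assumes one number per element)
v1 <- unlist(list1)

# Test all values against list2 in a single vectorized call
res <- v1 %in% list2[[1]]

# Convert back to a list only if the list shape is really needed
res_list <- as.list(res)
```

Here res is a logical vector with the same length as list1, and res_list holds the same values as a list.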
For every element of list1, this runs a separate lookup against all elements of list2, so it can be slow for long lists. A potentially faster alternative is to store the elements of list2 in a hash table once and then test membership in constant time on average. One way to do this in R is the r2r package:
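library(r2r)
# Build the hash set once from the values in list2
m <- hashset()
for (i in list2[[1]]) insert(m, i)
# Look up each element of list1; this returns a list of TRUE/FALSE
lapply(list1, function(x) has_key(m, x))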
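If installing a package is not an option, the same hashing idea can be sketched in base R with a hashed environment used as a set. This is only an illustration, not a benchmarked solution: it assumes the values are whole numbers, so that as.character() gives the same key for equal values, and the names set and res are made up for the example:

```r
list1 <- list(571, 572, 573, 574, 561, 562, 563, 564, 595, 570, 571, 573)
list2 <- list(c(571, 564, 565, 600))

# A hashed environment acts as the set; keys must be character strings
set <- new.env(hash = TRUE)
for (k in as.character(list2[[1]])) assign(k, TRUE, envir = set)

# Look up each element of list1 by its string key.
# inherits = FALSE keeps the lookup inside `set` only.
res <- vapply(
  list1,
  function(x) exists(as.character(x), envir = set, inherits = FALSE),
  logical(1)
)
```

vapply returns a logical vector here; wrap the result in as.list() if a list is required.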