I have a list of lists in R. Each sublist contains multiple elements, and the sublists do not necessarily all have the same length. Each sublist has its own header name. Like this:
#create list of lists
vector1 = c("apple","banana","cherry")
vector2 = c("banana","date","fig")
vector3 = c("fig","jackfruit","mango","plum")
listoflists = list(vector1 , vector2, vector3)
names(listoflists) = c("listA", "listB", "listC")
The list of lists looks like this:
listoflists
$listA
[1] "apple" "banana" "cherry"
$listB
[1] "banana" "date" "fig"
$listC
[1] "fig" "jackfruit" "mango" "plum"
Next, I have a vector that contains elements that can also be found within the sublists. Like this:
wanted = c("apple","banana","fig")
wanted
[1] "apple" "banana" "fig"
For each element in the vector wanted, I want to extract the header names of every sublist in the list of lists that contains this particular element. For the example presented here, the output should look something like this:
#desired output
apple listA
banana listA listB
fig listB listC
I thought about putting this into a for loop to obtain something like this:
output_list = list()
for (i in wanted){
output = EXTRACT LIST HEADER WHEN i IS PRESENT IN SUBLIST
output_list[[i]] = output
}
However, it is not clear to me whether I can, and if so how to, loop over the list of lists to extract the header names of only those sublists that contain the element in the vector wanted. I looked into using the unlist function, but that did not seem to be useful for this problem. I searched Stack Overflow as well as other forums but could not find any question outlining a similar problem. It would thus be really helpful if someone could point me in the right direction to solve this issue.
Thanks already!
There are multiple ways to get the output.
1) An option is to loop over the 'listoflists', subset each vector based on the 'wanted' values, stack it to a two-column data.frame, and split it into a list again by 'values'
with(stack(lapply(listoflists, function(x)
x[x %in% wanted])), split(as.character(ind), values))
#$apple
#[1] "listA"
#$banana
#[1] "listA" "listB"
#$fig
#[1] "listB" "listC"
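To see why the stack/split combination works, it can help to look at the intermediate two-column data.frame. The following is only a sketch that unrolls option 1 step by step; the variable names `filtered` and `long` are introduced here for illustration and are not part of the original answer.

```r
# Same data as in the question
listoflists <- list(
  listA = c("apple", "banana", "cherry"),
  listB = c("banana", "date", "fig"),
  listC = c("fig", "jackfruit", "mango", "plum")
)
wanted <- c("apple", "banana", "fig")

# Step 1: keep only the wanted elements in each sublist
filtered <- lapply(listoflists, function(x) x[x %in% wanted])

# Step 2: stack() turns the named list into a long data.frame with
# one row per element: 'values' holds the element, 'ind' the sublist name
long <- stack(filtered)
long
#   values   ind
# 1  apple listA
# 2 banana listA
# 3 banana listB
# 4    fig listB
# 5    fig listC

# Step 3: split the sublist names by element
result <- split(as.character(long$ind), long$values)
```

Each row of `long` pairs one element with the name of the sublist it came from, so splitting 'ind' by 'values' groups the sublist names per element.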
2) Or we can stack first to a two-column 'data.frame', then subset the rows, and split
with(subset(stack(listoflists), values %in% wanted),
split(as.character(ind), values))
#$apple
#[1] "listA"
#$banana
#[1] "listA" "listB"
#$fig
#[1] "listB" "listC"
3) Or another option is to loop over the 'wanted' and get the names of the 'listoflists' based on a match
setNames(lapply(wanted, function(x)
names(which(sapply(listoflists, function(y) x %in% y)))), wanted)
#$apple
#[1] "listA"
#$banana
#[1] "listA" "listB"
#$fig
#[1] "listB" "listC"
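For completeness, the for loop sketched in the question can also be filled in directly. This is a minimal sketch rather than one of the options above: the name `hit` is introduced here for illustration, and `vapply` is used instead of `sapply` only so that the result is guaranteed to be a logical vector.

```r
# Same data as in the question
listoflists <- list(
  listA = c("apple", "banana", "cherry"),
  listB = c("banana", "date", "fig"),
  listC = c("fig", "jackfruit", "mango", "plum")
)
wanted <- c("apple", "banana", "fig")

output_list <- list()
for (i in wanted) {
  # TRUE for every sublist that contains the element i
  hit <- vapply(listoflists, function(sub) i %in% sub, logical(1))
  # keep the header names of the matching sublists
  output_list[[i]] <- names(listoflists)[hit]
}
output_list
```

Like option 3, this version keeps an entry for every element of 'wanted'. An element that occurs in no sublist ends up with an empty character vector, whereas the stack-based options 1 and 2 leave such an element out of the result.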