
R: How to extract a vector of integers from a multi-level list data structure


I am working in R, trying to extract a vector of numbers from a list data structure. Here is a small reproducible example that reflects the structure of a much larger dataset.

# Create a reproducible example multi-level list structure 
mylist <- list()
mylist$`1` <- c("barcodes","data")
mylist$`2` <- c("barcodes","data")
mylist$`3` <- c("barcodes","data")
mylist$`1`$barcodes <- c(1:50)
mylist$`2`$barcodes <- c(50:200)
mylist$`3`$barcodes <- c(1:200)

I can generate the data I need (called numbers here) with the following command. However, this requires hard-coding each dataset of interest, which is not ideal.

numbers <- c(mylist$`1`$barcodes[1:5],
             mylist$`3`$barcodes[1:5])

#This does achieve the desired result 
#> numbers
#[1] 1 2 3 4 5 1 2 3 4 5

I am trying to do this in a high-throughput way, without hard-coding. Below is my attempt.

nums_of_interest <- c(1,3)
numbers <- c(gsub(" ", "", paste("mylist$'",nums_of_interest,"'$barcodes[1:5]")))

# This does not achieve the desired result
#> numbers
#[1] "mylist$'1'$barcodes[1:5]" "mylist$'3'$barcodes[1:5]"

I am struggling to find a high-throughput way to extract the numbers of interest, 1 2 3 4 5 1 2 3 4 5.


Solution

  • Here is one option with lapply. The first line extracts "barcodes" from each element of your list and returns the results as a list.

    lst <- lapply(mylist, `[[`, "barcodes")
    

    Next, we do almost the same thing again to extract the first 5 entries from each element of a subset of lst, namely lst[nums_of_interest].

    nums_of_interest <- c(1, 3)
    (numbers <- lapply(lst[nums_of_interest], `[`, 1:5))
    #$`1`
    #[1] 1 2 3 4 5
    #
    #$`3`
    #[1] 1 2 3 4 5
    

    Since numbers is a list but you want a vector in return, use unlist (and unname) to get the desired output.

    unname(unlist(numbers))
    # [1] 1 2 3 4 5 1 2 3 4 5
    

    Or in one line, with credit to @avid_useR:

    unname(unlist(lapply(mylist[nums_of_interest], function(x) x[['barcodes']][1:5])))
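
One caveat with the approach above: because nums_of_interest is numeric, mylist[nums_of_interest] selects elements by position, which matches the names "1" and "3" here only because the elements are stored in that order. The sketch below wraps the same lapply/unlist idea in a small helper that looks elements up by name instead. The function name first_barcodes and its arguments are hypothetical, and the example data is rebuilt as a plain list of lists on the assumption that each element holds a "barcodes" vector.

```r
# Rebuild the example data as a list of lists (assumed structure:
# each element holds an integer vector called "barcodes").
mylist <- list(
  `1` = list(barcodes = 1:50),
  `2` = list(barcodes = 50:200),
  `3` = list(barcodes = 1:200)
)

# Hypothetical helper: take the first n barcodes from the elements
# whose *names* match ids, and return them as one plain vector.
first_barcodes <- function(x, ids, n = 5) {
  picked <- x[as.character(ids)]   # look up by name, not by position
  out <- lapply(picked, function(el) head(el[["barcodes"]], n))
  unlist(out, use.names = FALSE)   # flatten and drop names in one step
}

numbers <- first_barcodes(mylist, c(1, 3))
numbers
# [1] 1 2 3 4 5 1 2 3 4 5
```

Using unlist(out, use.names = FALSE) gives the same result as unname(unlist(out)) here, and head() returns fewer than n values without padding when a barcode vector is shorter than n.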