r purrr

pull all elements with specific name from a nested list


I have some archived Slack data from which I am trying to extract a few key message properties. So far I have done this by naively flattening the entire list, which gives a data.frame or tibble with lists nested in some cells. As the dataset grows, I want to pick elements out of the list more selectively, so that building the data.frame or tibble with the elements I want doesn't take forever.

In the example below, I am trying to pull everything named "type" into a vector or flat list that I can then use as a data frame variable. I named the folder and message levels for convenience. Does anyone have model code that can help?

library(tidyverse)
    
l <- list(folder_1 = list(
  `msg_1-1` = list(type = "message",
               subtype = "channel_join",
               ts = "1585771048.000200",
               user = "UFUNNF8MA",
               text = "<@UFUNNF8MA> has joined the channel"),
  `msg_1-2` = list(type = "message",
                   subtype = "channel_purpose",
                   ts = "1585771049.000300",
                   user = "UNFUNQ8MA",
                   text = "<@UNFUNQ8MA> set the channel purpose: Talk about xyz")),
  folder_2 = list(
    `msg_2-1` = list(type = "message",
                  subtype = "channel_join",
                  ts = "1585771120.000200",
                  user = "UQKUNF8MA",
                  text = "<@UQKUNF8MA> has joined the channel")) 
)

# gets a specific element
print(l[[1]][[1]][["type"]])

# tried to get all elements named "type", but am not at the right list level to do so
print(purrr::map(l, "type"))

Solution

  • Building on the answers from @Duck and @Abdessabour Mtk yesterday: purrr has a function, map_depth(), that extracts a named element when you know its name and how deep it sits in the hierarchy. It is really useful when crawling big nested lists like this one, and it is simpler than nesting map() calls.

    purrr::map_depth(l, 2, "type")
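
The map_depth() call returns a list that still has the folder/message nesting, so one more step is needed to get the flat vector or data frame column the question asks for. The following is a minimal sketch of that step, not a definitive approach: it uses a reduced copy of the example list, and the variable names (types_nested, types, subtypes, msgs) are made up for illustration. It assumes every message actually has the field being extracted; unlist() silently drops NULLs, so a missing field would leave the columns misaligned.

```r
library(purrr)

# Reduced copy of the list from the question: two folders, three messages.
l <- list(
  folder_1 = list(
    `msg_1-1` = list(type = "message", subtype = "channel_join"),
    `msg_1-2` = list(type = "message", subtype = "channel_purpose")
  ),
  folder_2 = list(
    `msg_2-1` = list(type = "message", subtype = "channel_join")
  )
)

# Extract "type" at depth 2 (folder -> message). The result keeps the nesting.
types_nested <- map_depth(l, 2, "type")

# The same extraction written as nested map() calls, for comparison.
types_nested_alt <- map(l, function(folder) map(folder, "type"))

# Flatten to a named character vector with one entry per message.
# Names are built from the folder and message names, e.g. "folder_1.msg_1-1".
types <- unlist(types_nested)

# Repeat for a second field and combine into a data frame.
# Assumption: every message has both fields, so the vectors line up.
subtypes <- unlist(map_depth(l, 2, "subtype"))
msgs <- data.frame(
  msg = names(types),
  type = types,
  subtype = subtypes,
  row.names = NULL,
  stringsAsFactors = FALSE
)
```

If some messages can lack a field, it is safer to fill in a placeholder before flattening rather than rely on unlist().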