Tags: r, dplyr, apply, sapply

Extract the column from the inner data frame within a list of lists of lists


I am trying to extract a column from a data frame that is stored as an element inside a list of lists. There are 100 models in total, and 10 different data models are stored in each individual list. Each individual list then contains the data frame. (The top-level list is nested up to three levels deep.)

Furthermore, I want to collect these column values in a separate data frame, which I am trying to do with rbind.

I am able to get all the values from the final data frame, but not specifically the column of interest, and I cannot tell which values belong to which model. My attempts are shown below:

With sapply

Here I am able to obtain all the values, but I am not able to use dplyr afterwards for further selection:

importance <- sapply(unlist(myNestedList, recursive = FALSE), `[[`, 9)

With base R and dplyr

library(dplyr)   # select(), arrange(), slice(), mutate(), rename(), %>%
library(tibble)  # rownames_to_column()

meanDF <- data.frame()

# Loop through the nested model list
for (i in seq_along(top_list)) {
  for (j in seq_along(top_list[[i]])) {
    # Note: x is never used in the body, so the same rows are appended once per x
    for (x in seq_along(top_list[[i]][[j]])) {
      test <- as.data.frame(top_list[[i]][[j]]$importance) %>%
        select(Mean) %>%
        arrange(desc(Mean)) %>%
        slice(1:50) %>%
        rownames_to_column(var = "Bus") %>%
        mutate(RF = j) %>%
        rename(value = Mean)

      # Append to meanDF
      meanDF <- rbind(meanDF, test)
    }
  }
}

dataset example

n1 <- 100 ; m <- 200 ; reps <- 6
df1 <- as.data.frame(cbind(matrix(seq_len(m), n1, m/n1), 
                           replicate(reps, sample(c(0, 1), n1, replace = TRUE))))

n2 <- 100 ; m <- 200 ; reps <- 4
df2 <- as.data.frame(cbind(matrix(seq_len(m), n2, m/n2), 
                           replicate(reps, sample(c(0, 1), n2, replace = TRUE))))


# create a nested list: two top-level lists, each holding
# a list that contains one data frame
top_list <- list(list1 = list(list2 = list(df1)),
                 list3 = list(list4 = list(df2)))




Solution

  • We may use

    library(purrr)
    map_depth(top_list, 3, pluck, 2)
    

    Or use a recursive function

    library(rrapply)
    rrapply(top_list, classes = "data.frame", f = \(x) x[[2]], how = "flatten")
    

    -output

    $`1`
      [1] 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143
     [44] 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186
     [87] 187 188 189 190 191 192 193 194 195 196 197 198 199 200
    
    $`1`
      [1] 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143
     [44] 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186
     [87] 187 188 189 190 191 192 193 194 195 196 197 198 199 200
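
To also keep track of which values belong to which model and collect everything in one data frame, the extracted columns can be flattened and stacked. The following is only a minimal sketch, assuming the nested lists are named as in the example above. The small toy data frames and the names `cols`, `flat`, and `long` are illustrative and not part of the original question.

```r
library(purrr)

# Toy data with the same nesting as the example (smaller, for readability)
df1 <- data.frame(V1 = 1:5, V2 = 6:10)
df2 <- data.frame(V1 = 1:5, V2 = 11:15)
top_list <- list(list1 = list(list2 = list(df1)),
                 list3 = list(list4 = list(df2)))

# Extract the second column of every data frame at depth 3
cols <- map_depth(top_list, 3, pluck, 2)

# Flatten two levels; unlist() joins the list names with "."
flat <- unlist(unlist(cols, recursive = FALSE), recursive = FALSE)

# Stack the named vectors into one long data frame:
# one column for the values, one for the (joined) list names
long <- stack(flat)
names(long) <- c("value", "model")
```

Here `long` has one row per extracted value, and the `model` column records the path of list names each value came from. If the lists in the real data are unnamed, names would have to be set first (for example with `setNames()`), since `stack()` relies on them.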