r, nested

full_join each list in a nested list, using lapply within a single set of braces (not a separate lapply or map2)


I have looked at similar past questions, but none really matches mine, so here goes:

[Image: the nested list as]

This list is built from the folder structure below.

[Image: folder structure]

[Image: files in each folder]

Using the code below, I got as far as reading the CSV files and naming the elements of the nested lists:

library(stringr)  # for str_sub()

data_dir <- "C:/Users/thepr/Documents/data/as"
num_fu <- 1:9
dirs <- paste0(data_dir, num_fu)

as_list <- lapply(dirs, function(x) {
  files <- list.files(x, pattern = "\\.csv$", full.names = TRUE)
  names(files) <- str_sub(basename(files), 1, 6)
  lapply(files, read.csv)
})

What I want to do is,

>as[[1]]
(a full_join dataframe of as[[1]][[1]], as[[1]][[2]], as[[1]][[3]],... as[[1]][[9]])
>as[[2]]
(a full_join dataframe of as[[2]][[1]], as[[2]][[2]], as[[2]][[3]],... as[[2]][[9]])
.
.
.
>as[[9]]
(a full_join dataframe of as[[9]][[1]], as[[9]][[2]], as[[9]][[3]],... as[[9]][[9]])

and each joined data frame should be named:

>names(as_list)
[[1]] as1
[[2]] as2
.
.
.
[[9]] as9

The nested list makes it difficult to use lapply. As suggested here, I want to use reduce(full_join, as_list[[x]]), but I can't figure out how to write it as working R code.

data_dir <- "C:/Users/thepr/Documents/data/as"
num_fu <- 1 : 9
dirs <- paste0(data_dir, num_fu)

as_list <- lapply(dirs, function(x) {
  files <- list.files(x, pattern = "\\.csv$", full.names = TRUE)
  names(files) <- str_sub(basename(files), 1, 6)
  lapply(files, function(y) {
    read.csv(y)
    reduce(full_join, as_list[[x]])
    })
})

Error in x[[1]] : object of type 'closure' is not subsettable
Called from: reduce_init(.x, .init, left = left, error_call = .purrr_error_call)

OR

as_list <- lapply(as_list, function(x){
  reduce(full_join, as_list[[x]])
  })



Neither attempt works. Thank you!


Solution

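  • library(dplyr)    # full_join()
    library(stringr)  # str_sub()

    data_dir <- "C:/Users/thepr/Documents/data/as"
    num_fu <- 1:9
    dataframe <- "as"
    dirs <- paste0(data_dir, num_fu)  # .../as1, .../as2, ..., .../as9

    # One joined data frame per folder: read every CSV, then full_join them in turn
    as_list <- lapply(dirs, function(x) {
      files <- list.files(x, pattern = "\\.csv$", full.names = TRUE)
      names(files) <- str_sub(basename(files), 1, 6)
      Reduce(full_join, lapply(files, read.csv))
    })

    # Name the results as1 ... as9
    names(as_list) <- paste0(dataframe, num_fu)

    # Join the nine folder-level data frames into a single data frame
    as <- Reduce(full_join, as_list)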
    

    gives

    > as_list[[1]] %>% head() %>% dput()
    structure(list(RID = c("EPI22_039_181584", "EPI22_039_077150", 
    "EPI22_039_042243", "EPI22_039_115383", "EPI22_039_035585", "EPI22_039_070773"
    ), AS1_AREA = c(1L, 1L, 1L, 1L, 2L, 1L), AS1_EDATE1 = c(200210L, 
    200111L, 200108L, 200110L, 200201L, 200111L), AS1_SEX = c(1L, 
    2L, 1L, 1L, 1L, 1L), AS1_AGE = c(63L, 60L, 49L, 62L, 46L, 66L
    ), AS1_FAMNUM = c(2L, 1L, 2L, 2L, 4L, 2L), AS1_HAND = c(3L, 1L, 
    1L, 1L, 1L, 1L), AS1_SEAS = c(3L, 1L, 1L, 3L, 3L, 3L), AS1_MARRYA = c(2L, 
    2L, 2L, 1L, 2L, 2L), AS1_MARRYAETC = c("77777", "77777", "77777", 
    ....
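
The error in the question most likely comes from argument order. purrr's reduce() expects the list first and the function second, so reduce(full_join, ...) ends up trying to subset the function full_join itself, which is the 'closure' named in the message. Base R's Reduce() takes the function first, which is why the code above runs. Below is a minimal sketch on made-up data: the list `wave`, every column except RID, and the explicit `by = "RID"` are assumptions for illustration (the answer's code omits `by` and so joins on all shared columns).

```r
library(dplyr)  # full_join()
library(purrr)  # reduce()

# Hypothetical stand-in for the CSV files of one folder:
# three small data frames that share the key column RID.
wave <- list(
  a = data.frame(RID = c("r1", "r2"), AREA = c(1L, 2L)),
  b = data.frame(RID = c("r2", "r3"), SEX  = c(2L, 1L)),
  c = data.frame(RID = c("r1", "r3"), AGE  = c(63L, 49L))
)

# Small wrapper so the join key is stated explicitly (assumed to be RID).
join_by_rid <- function(x, y) full_join(x, y, by = "RID")

# Base R: function first, list second.
joined_base <- Reduce(join_by_rid, wave)

# purrr: list first, function second.
joined_purrr <- reduce(wave, join_by_rid)

# For a nested list that has already been read in, join each inner
# list inside one lapply(); the outer names are carried over.
nested <- list(as1 = wave, as2 = wave)
joined_list <- lapply(nested, function(dfs) reduce(dfs, join_by_rid))
```

The last line is the shape the second attempt in the question was aiming for: inside lapply(), `x` is already the inner list of data frames, so it is passed to reduce() directly rather than used as an index into as_list.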