
Making the .id argument show the names of objects after binding a nested list (R)


I am working with a large nested list of tibbles. A previous post already helped me out, but I am stuck at the last step: forming a usable data frame out of the nested list. The data frame should contain an 'id' column that shows the name each tibble has within the list. I tried bind_rows(.id = 'id'), but it discards the names and fills the column with a numeric index. How can I avoid this? Here is a minimal version of my problem. (I am not sure the example is precise enough, as I was not able to name each list element, but I hope the idea comes across.)

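library(tibble)

# Three small tibbles; `c` has two rows and no columns
a <- tibble(a = numeric(7),
            b = letters[7:1],
            c = integer(length = 1))

b <- tibble(a = integer(length = 1),
            b = as.numeric(8),
            c = letters[7:1])

c <- tibble(.rows = 2)

# Unnamed nested lists of increasing depth
A <- list(list(a, b, c))
B <- list(A, list(a, b, c))
C <- list(A, B)

riddle <- list(A, B, C)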

The following is the code I run to get my original data into shape. As you will see, the id column only gets numeric indexes, both for the example and for my original data:

library(rrapply)
library(dplyr)

rrapply(riddle, condition = function(x) all(dim(x) > 0),
        f = function(x) {
          # change to unique column names
          names(x) <- make.unique(names(x))
          x %>%
            # convert all columns to character in case column
            # types differ between list elements
            mutate(across(everything(), as.character))
        }, classes = "data.frame", how = "flatten") %>%
  # bind the flattened list of data.frames/tibbles into a single dataset
  bind_rows(.id = "id") %>%
  # convert the columns back to suitable types
  type.convert(as.is = TRUE)

Supposing my example had names for the 12 values of id: which command would I need, and how would I use it, to get the names of the objects as values in the id column?
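
For context, bind_rows() fills the .id column from the names of the list it is given and falls back to positions when the list has no names. The following is a minimal sketch of that behaviour; the tibbles df1 and df2 and the names first and second are made up for illustration and are not part of the data above.

```r
library(dplyr)
library(tibble)

# Two small illustrative tibbles (hypothetical data, not from the question)
df1 <- tibble(x = 1:2)
df2 <- tibble(x = 3:4)

# Unnamed list: .id falls back to the position of each element
unnamed <- bind_rows(list(df1, df2), .id = "id")
unique(unnamed$id)

# Named list: .id is filled with the element names
named <- bind_rows(list(first = df1, second = df2), .id = "id")
unique(named$id)
```

So the numeric index is what one would expect whenever the list that reaches bind_rows() carries no names.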


Solution

  • If the list has names, then we may be able to extract them and create 'id' from the names of the list

    library(rrapply)
    library(dplyr)
    library(stringr)
    A <- list(list(a, b, c))
    B <- list(A = A, list(a, b, c))
    C <- list(A = A, B = B)
    riddle <- list(A = A, B = B, C = C)
    

    -testing

    out <- rrapply(riddle, condition = function(x) all(dim(x) > 0),
            f = function(x, .xparents) {
              # change to unique column names
              names(x) <- make.unique(names(x))
              x %>%
                # build 'id' from the non-empty names of the parent lists
                mutate(id = str_c(setdiff(.xparents, ""),
                                  collapse = "_"), .before = 1) %>%
                # convert all columns to character in case column
                # types differ between list elements
                mutate(across(everything(), as.character))
            }, classes = "data.frame", how = "flatten") %>%
        bind_rows() %>%
        type.convert(as.is = TRUE)
    

    -output

    > out
    # A tibble: 84 × 4
       id        a b     c    
       <chr> <int> <chr> <chr>
     1 A_1       0 g     0    
     2 A_1       0 f     0    
     3 A_1       0 e     0    
     4 A_1       0 d     0    
     5 A_1       0 c     0    
     6 A_1       0 b     0    
     7 A_1       0 a     0    
     8 A_1_2     0 8     g    
     9 A_1_2     0 8     f    
    10 A_1_2     0 8     e    
    # … with 74 more rows
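
One detail that may matter when reading the id values: setdiff() returns unique values, so it removes the empty strings and also collapses any repeated entries in the parent path. The sketch below uses a made-up vector parents to show the difference. Whether .xparents actually contains repeats for a given list depends on how the list is named, so treat the input here as an assumption.

```r
# A hypothetical parent path in which the same index occurs twice
parents <- c("A", "1", "1")

# setdiff() returns unique values, so the repeated "1" is collapsed
with_setdiff <- paste(setdiff(parents, ""), collapse = "_")
with_setdiff

# Logical subsetting drops only the empty strings and keeps repeats
with_nzchar <- paste(parents[nzchar(parents)], collapse = "_")
with_nzchar
```

If every level of the path should appear in the id, replacing setdiff(.xparents, "") with .xparents[nzchar(.xparents)] is one possible adjustment.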