I am working with a large nested list of tibbles. A previous post already helped me out, but I am stuck at the last step: forming a usable data frame out of the nested list.
The data frame should have an 'id' column that shows the name each tibble has within the list. I tried bind_rows(.id = 'id'),
but it discards the names and assigns a numeric index instead. How can I avoid this?
Here is a minimized version of my problem:
(I am not really sure if the example is precise enough, as I was not able to name each list element, but I hope the idea comes across)
library(tibble)

a <- tibble(a = numeric(7),
            b = letters[7:1],
            c = integer(length = 1))
b <- tibble(a = integer(length = 1),
            b = as.numeric(8),
            c = letters[7:1])
# tibble with 2 rows and 0 columns
c <- tibble(.rows = 2)
A <- list(list(a, b, c))
B <- list(A, list(a, b, c))
C <- list(A, B)
riddle <- list(A, B, C)
Below is the code I run to get my original data into shape. As you will see, the id column only gets numeric indexes, both for the example and for my original data:
library(rrapply)
library(dplyr)

rrapply(riddle, condition = function(x) all(dim(x) > 0),
        f = function(x) {
          # change to unique column names
          names(x) <- make.unique(names(x))
          x %>%
            # convert all columns to character in case the
            # column types differ between list elements
            mutate(across(everything(), as.character))
        }, classes = "data.frame", how = "flatten") %>%
  # bind the flattened list of data.frames/tibbles into a single dataset
  bind_rows(.id = "id") %>%
  # do the column type conversion
  type.convert(as.is = TRUE)
Assuming my example had names for the 12 values of id: which command would I need, and how would I use it, to get the names of the objects as values of the .id column?
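For reference, here is a small check of how `bind_rows()` fills the `.id` column. The two toy lists below are made up for illustration and are not the data from the question; the only thing they are meant to show is the difference between a named and an unnamed flat list.

```r
library(dplyr)

# Hypothetical toy data, not the tibbles from the question
unnamed <- list(tibble(x = 1:2), tibble(x = 3L))
named   <- list(first = tibble(x = 1:2), second = tibble(x = 3L))

# Without names, .id falls back to a numeric sequence (stored as character)
ids_unnamed <- bind_rows(unnamed, .id = "id")$id
ids_unnamed  # "1" "1" "2"

# With names, .id takes the names of the list
ids_named <- bind_rows(named, .id = "id")$id
ids_named    # "first" "first" "second"
```

So a flat, named list gets its names in `id`, while an unnamed one gets indices.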
If the list has names, then we may be able to extract them and create 'id' from the names of the list
library(rrapply)
library(dplyr)
library(stringr)
A <- list(list(a, b, c))
B <- list(A = A, list(a, b, c))
C <- list(A = A, B = B)
riddle <- list(A = A, B = B, C = C)
-testing
out <- rrapply(riddle, condition = function(x) all(dim(x) > 0),
               f = function(x, .xparents) {
                 # change to unique column names
                 names(x) <- make.unique(names(x))
                 x %>%
                   # build 'id' from the non-empty parent names
                   mutate(id = str_c(setdiff(.xparents, ""),
                                     collapse = "_"), .before = 1) %>%
                   # convert all columns to character in case the
                   # column types differ between list elements
                   mutate(across(everything(), as.character))
               }, classes = "data.frame", how = "flatten") %>%
  bind_rows() %>%
  type.convert(as.is = TRUE)
-output
> out
# A tibble: 84 × 4
   id        a b     c
   <chr> <int> <chr> <chr>
 1 A_1       0 g     0
 2 A_1       0 f     0
 3 A_1       0 e     0
 4 A_1       0 d     0
 5 A_1       0 c     0
 6 A_1       0 b     0
 7 A_1       0 a     0
 8 A_1_2     0 8     g
 9 A_1_2     0 8     f
10 A_1_2     0 8     e
# … with 74 more rows
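One thing to watch in the `id` values: `setdiff()` returns unique values, so repeated entries in a path are collapsed. The sketch below uses a hypothetical path vector `c("A", "1", "1")`, assumed to resemble what `.xparents` holds for the first tibble (this is inferred from the output above and not checked against rrapply itself), and compares it with a filter that only drops empty strings.

```r
# Hypothetical path vector, assumed to resemble .xparents for the first tibble
path <- c("A", "1", "1")

# setdiff() de-duplicates, so the repeated "1" is dropped
id_setdiff <- paste(setdiff(path, ""), collapse = "_")
id_setdiff  # "A_1"

# Logical subsetting drops only the empty strings and keeps repeats
id_full <- paste(path[nzchar(path)], collapse = "_")
id_full     # "A_1_1"
```

If every level of the path matters for telling the tibbles apart, the second form could replace the `setdiff()` call inside `str_c()`.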