How can I make a data frame from a list of sublists with different lengths in R?


I have a list of sublists in R, and the sublists have different lengths. I would like to build a data frame from this list. The challenge is that the first item of each sublist is the key for the remaining items of that sublist, so it must be repeated on every row that the sublist produces. The list I have is this:

lista <- list(list("data pregão 16187465 1 27/08/2020 clear", 
                   "1-bovespa c vista itausa pn ed n1 100 9,67 967,00 d"),
              list("data pregão 17212976 1 10/09/2020 clear",
                   "1-bovespa v vista itausa pn ed n1 100 9,40 940,00 c"),
              list("data pregão 19759871 1 19/10/2020 clear",
                  c("1-bovespa c fracionario magaz luiza on eb nm # 1 25,76 25,76 d", "1-bovespa c fracionario magaz luiza on eb nm # 9 25,76 231,84 d", "1-bovespa c fracionario magaz luiza on eb nm 40 25,76 1.030,40 d", "1-bovespa c fracionario mrv on ed nm 40 18,14 725,60 d")))

Solution

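  • Extract the first and second elements of each sublist separately and build a tibble from them while looping over the outer list. Using map_dfr instead of map (the _dfr suffix) row-binds the per-sublist tibbles into a single one. Because the first element has length 1, tibble() recycles it to match the length of the second element.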

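    library(purrr)
    library(tibble)  # tibble() is not exported by purrr
    # one tibble per sublist; the length-1 key is recycled across its items
    map_dfr(lista, ~ tibble(col1 = .x[[1]], col2 = .x[[2]]))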
    

    Output:

    # A tibble: 6 x 2
    #  col1                                    col2                                                            
    #  <chr>                                   <chr>                                                           
    #1 data pregão 16187465 1 27/08/2020 clear 1-bovespa c vista itausa pn ed n1 100 9,67 967,00 d             
    #2 data pregão 17212976 1 10/09/2020 clear 1-bovespa v vista itausa pn ed n1 100 9,40 940,00 c             
    #3 data pregão 19759871 1 19/10/2020 clear 1-bovespa c fracionario magaz luiza on eb nm # 1 25,76 25,76 d  
    #4 data pregão 19759871 1 19/10/2020 clear 1-bovespa c fracionario magaz luiza on eb nm # 9 25,76 231,84 d 
    #5 data pregão 19759871 1 19/10/2020 clear 1-bovespa c fracionario magaz luiza on eb nm 40 25,76 1.030,40 d
    #6 data pregão 19759871 1 19/10/2020 clear 1-bovespa c fracionario mrv on ed nm 40 18,14 725,60 d          
    

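    Or use bind_cols with map_dfr. Since the sublist elements are unnamed, the resulting columns get automatically generated names instead of col1 and col2.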

    library(purrr)
    library(dplyr)
    map_dfr(lista, bind_cols)
    

    Or using base R

    do.call(rbind, lapply(lista, function(x) 
            as.data.frame(x, col.names = c('col1', 'col2'))))
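
    The approaches above all rely on the length-1 key being recycled implicitly. As a minimal base R sketch that makes the repetition explicit, the key can be repeated once per item with rep and lengths. The list lista_demo below is a shortened, hypothetical stand-in for lista with the same shape (a key followed by a character vector of items); the names keys, items and result are illustrative only.

    ```r
    # Hypothetical stand-in for `lista`: each sublist is list(key, items).
    lista_demo <- list(
      list("key A", "item 1"),
      list("key B", c("item 2", "item 3", "item 4"))
    )

    # First element of every sublist (the key), as a character vector.
    keys <- vapply(lista_demo, function(x) x[[1]], character(1))

    # Second element of every sublist (one or more items), kept as a list.
    items <- lapply(lista_demo, function(x) x[[2]])

    # Repeat each key once per item in its sublist, then flatten the items.
    result <- data.frame(
      col1 = rep(keys, times = lengths(items)),
      col2 = unlist(items),
      stringsAsFactors = FALSE
    )
    print(result)
    ```

    Replacing lista_demo with lista should give the same six rows as the tibble shown earlier, assuming every sublist has exactly two elements.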