Search code examples
rlistdplyrreduce

reduce list of multiple data.frames into one data.frame with unequal rows


I have multiple data.frames with two column, all of them share the first column. I get them in a list and want to combine them into one data.frame using dplyr::bind_cols

However this is not possible, as they have unequal rows. I don't have the chance to change the structure of the dataset. How am I able to join the data.frames using dplyr?

I tried bind_rows and full_join, but both don't work.

Example:

a <- data.frame(a = rep(1:9), col_a1 = "a")
b <- data.frame(a = rep(1:9), col_b = "b")
c <- data.frame(a = rep(1:8), col_c = "a")

data_list <- list(a, b, c)

data_all <- reduce(data_list, bind_cols)
# Error in `fn()`:
#   ! Can't recycle `..1` (size 9) to match `..2` (size 8).
# Run `rlang::last_trace()` to see where the error occurred.

data_all <- reduce(data_list, full_join(by = "a"))

#Wanted output:   
    data_all
    a col_a1 col_b col_c
    1 1      a     b     a
    2 2      a     b     a
    3 3      a     b     a
    4 4      a     b     a
    5 5      a     b     a
    6 6      a     b     a
    7 7      a     b     a
    8 8      a     b     a
    9 9      a     b  <NA>

I am happy for every advice. Note that in reality, I have hundreds of data.frames in the list at the beginning, so I cannot type in manually.

I also tried reduce(bind_rows), with the idea of pivot_wider afterwards, but then the column names are gone.

Thank you!


Solution

  • library(tidyverse)
    
    reduce(data_list, left_join)
    
    # A tibble: 9 × 4
          a col_a1 col_b col_c
      <int> <chr>  <chr> <chr>
    1     1 a      b     a    
    2     2 a      b     a    
    3     3 a      b     a    
    4     4 a      b     a    
    5     5 a      b     a    
    6     6 a      b     a    
    7     7 a      b     a    
    8     8 a      b     a    
    9     9 a      b     NA