Search code examples
rlistapplyrbindcbind

c-/r-bind from a list of data tables w/o recycling and w/ using list names as part of col names


there many post which shed some light on individual parts of my problem but I can't puzzle it together and need help. I'm trying to bind many data tables with varying row-count but identical column-count and -names from a list of data tables but:

  1. I do not want to recycle rows from the tables with fewer rows
  2. I do want to use the list item names to create individual columns from the "border" columns in the resulting table
  3. I do not want to use the list item names and create a single column from all "value" columns in the resulting table

Here is s small example and various partial solutions taken from other posts:

# Create dummy list of data frames    
df1 <- data.frame(border=c(1,2,3), value=c(4,5,6))
df2 <- data.frame(border=as.factor(c("A","B")), value=c(3,5.7))
df_lst <- list(df1, df2)
names(df_lst) <- c("df1","df2")


t1 <- as.data.table(df_lst)
t2 <- DataVisualizations::CombineCols(df1, df2)
t3 <- do.call(CombineCols,as.data.table(df_lst))
t4 <- rbindlist(df_lst)

Can anyone help me puzzle this together such that I can get something like this using the list (an not the individual data frames which I do not have in the real life example) as sole input ...

   df1 df2 value
1:   1 NA    4
2:   2 NA    5
3:   3 NA    6
4:  NA  A   3.0
5:  NA  B   5.7

Any hints appreciated!

Cheers, Mark


Solution

  • You can use mapply to rename the first column with the name of the data.frame, and then use dplyr::bind_rows (or if you prefer, data.table::rbindfill(fill = TRUE)):

    df_lst |> 
      mapply(FUN = \(x, y){names(x)[1] <- y; x}, 
             x = _, y = names(df_lst), SIMPLIFY = FALSE) |> 
      dplyr::bind_rows()
    
    #   df1 value  df2
    # 1   1   4.0 <NA>
    # 2   2   5.0 <NA>
    # 3   3   6.0 <NA>
    # 4  NA   3.0    A
    # 5  NA   5.7    B