Bind dataframes in a list two by two (or by name) - R


Let's say I have this list of data frames:

DF1_A <- data.frame (first_column  = c("A", "B","C"),
                     second_column = c(5, 5, 5),
                     third_column = c(1, 1, 1)
)

DF1_B <- data.frame (first_column  = c("A", "B","E"),
                     second_column = c(1, 1, 5),
                     third_column = c(1, 1, 1)
)

DF2_A <- data.frame (first_column  = c("E", "F","G"),
                     second_column = c(1, 1, 5),
                     third_column = c(1, 1, 1)
)

DF2_B <- data.frame (first_column  = c("K", "L","B"),
                     second_column = c(1, 1, 5),
                     third_column = c(1, 1, 1)
)

mylist <- list(DF1_A, DF1_B, DF2_A, DF2_B)
names(mylist) = c("DF1_A", "DF1_B", "DF2_A", "DF2_B")


mylist =  lapply(mylist, function(x){
  x[, "first_column"] <- as.character(x[, "first_column"])
  x
})

I want to bind them by name (all DF1 together, all DF2 together, etc.) or, equivalently, two by two in the order of this named list. Keeping the named-list structure is important so I can keep track of the groups (for example, DF1_A and DF1_B become DF1, or something similar, in names(mylist)).

Some rows have duplicated values, and I want to keep them (so some values will repeat after binding, such as A in first_column).

I have looked for clues here on Stack Overflow, but most people want to bind data frames irrespective of their names or order.

The final result would look something like this:

mylist
DF1
DF2

DF1
first_column    second_column   third_column
A               1               1
A               5               1
B               1               1
B               5               1
C               5               1
E               5               1

Solution

  • Do you mean something like this?

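    lapply(
      # group the list elements by the part of each name before "_"
      split(mylist, gsub("_.*", "", names(mylist))),
      function(v) {
        out <- do.call(rbind, v)                          # stack the group's data frames
        `row.names<-`(out[do.call(order, out), ], NULL)   # sort by all columns, reset row names
      }
    )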
    

    which gives

    $DF1
      first_column second_column third_column
    1            A             1            1
    2            A             5            1
    3            B             1            1
    4            B             5            1
    5            C             5            1
    6            E             5            1
    
    $DF2
      first_column second_column third_column
    1            B             5            1
    2            E             1            1
    3            F             1            1
    4            G             5            1
    5            K             1            1
    6            L             1            1
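
The answer above groups by name prefix. The question also mentions binding "two by two" by position, so here is a minimal sketch of that variant. It assumes the list is ordered so that each consecutive pair belongs together; `pair_id` and `bound` are names made up for this illustration. The sort step is left out, so rows stay in their original order, and duplicated values are kept either way.

```r
# Same data as in the question, built directly as a named list.
mylist <- list(
  DF1_A = data.frame(first_column = c("A", "B", "C"), second_column = c(5, 5, 5),
                     third_column = c(1, 1, 1), stringsAsFactors = FALSE),
  DF1_B = data.frame(first_column = c("A", "B", "E"), second_column = c(1, 1, 5),
                     third_column = c(1, 1, 1), stringsAsFactors = FALSE),
  DF2_A = data.frame(first_column = c("E", "F", "G"), second_column = c(1, 1, 5),
                     third_column = c(1, 1, 1), stringsAsFactors = FALSE),
  DF2_B = data.frame(first_column = c("K", "L", "B"), second_column = c(1, 1, 5),
                     third_column = c(1, 1, 1), stringsAsFactors = FALSE)
)

# Assumption: elements 1-2 form the first group, 3-4 the second, and so on.
pair_id <- ceiling(seq_along(mylist) / 2)

# Bind each pair; rbind() keeps every row, including duplicated values.
bound <- lapply(split(mylist, pair_id), function(v) {
  out <- do.call(rbind, v)
  row.names(out) <- NULL   # drop the "DF1_A.1"-style row names
  out
})

# Label each group with the prefix of its first member, e.g. "DF1_A" -> "DF1".
names(bound) <- gsub("_.*", "", names(mylist)[!duplicated(pair_id)])

names(bound)
```

If the names are reliable, grouping by prefix as in the answer is probably the safer choice, since it does not depend on the order of the list.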