Search code examples
rlistdataframemergecbind

Merge elements of dataframe in list based on partial match of names


I have a list:

lst <- list(a1=dfa1, a2=dfa2, b1=dfb1, b2=dfb2)

dfa1 <- data.frame(x=c(1:5), y=c(2, 5, 7, 9, 10))
dfa2 <- data.frame(x=c(1:6), y=c(3, 8, 1, 2, 4, 13))
dfb1 <- data.frame(x=c(1:4), y=c(7, 9, 3, 2))
dfb2 <- data.frame(x=c(1:7), y=c(9, 3, 5, 1, 7, 9, 11))

Base on the partial element match 'a' and 'b', I want column bind the dataframem and the new list should look like below:

new_list
$a
  x  y1 y2
1 1  2  3
2 2  5  8
3 3  7  1
4 4  9  2
5 5 10  4

$b
  x y1 y2
1 1 7  9
2 2 9  3
3 3 3  5
4 4 2  1

Solution

  • Here is a method with lapply and Reduce. lapply iterates through the letters "a" and "b" and applies Reduce to the list elements whose names contain the current letter. Reduce applies the merge function to the two data.frames, merging by the variable "x" and adding the desired suffixes with the given argument. Thanks to zx8754's suggestion, I added seq_along(grep(let, names(lst))) to allow the final names of the variables to increase by the number of group members.

    myList <- lapply(c("a", "b"), function(let)
                                  setNames(Reduce(function(x, y) merge(x, y, by="x"),
                                                  lst[grep(let, names(lst))]),
                                         c("x", paste0("y", seq_along(grep(let, names(lst)))))))
    [[1]]
      x y1 y2
    1 1  2  3
    2 2  5  8
    3 3  7  1
    4 4  9  2
    5 5 10  4
    
    [[2]]
      x y1 y2
    1 1  7  9
    2 2  9  3
    3 3  3  5
    4 4  2  1
    

    To add names to the list it is probably easiest to do this afterward,

    names(myList) <- c("a", "b")
    

    You could also start with the vector

    myVec <- c("a", "b")

    and then use it in the lapply and in the names line.