r merge bind

What type of join should I use?


I have two databases with different numbers of columns. All columns of the second database are included in the first database. The patients in the two databases are also different. I need to combine the two databases. In principle, merge (or the *_join functions of dplyr) will not work, since I need to stack the databases rather than match rows. Row binding (rbind) should not work either, because the columns differ. What is a simple way to do this?

mydata <- data.frame(
  ID = c(1, 1, 1, 2, 2), B = rep("b", 5), C = rep("c", 5), D = rep("d", 5)
)

mydata2 <- data.frame(ID = c(3, 4), B = c("b2", "b2"), C = c("c2", "c2"))

The expected dataset is this below:

  ID  B  C    D
1  1  b  c    d
2  1  b  c    d
3  1  b  c    d
4  2  b  c    d
5  2  b  c    d
6  3 b2 c2 <NA>
7  4 b2 c2 <NA>

Solution

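  • A plain merge with all = TRUE should suffice. By default, merge joins on every column the two data frames share (here ID, B and C). Since no row of mydata2 matches a row of mydata on those columns, the full outer join keeps all rows from both and fills D with NA where it is missing.

    merge( mydata, mydata2, all = TRUE )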
      ID  B  C    D
    1  1  b  c    d
    2  1  b  c    d
    3  1  b  c    d
    4  2  b  c    d
    5  2  b  c    d
    6  3 b2 c2 <NA>
    7  4 b2 c2 <NA>
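
The merge approach relies on no rows matching across the two data frames: if a row of mydata2 happened to share its ID, B and C values with a row of mydata, merge would join the two rows into one instead of stacking them, and it also sorts the result by the key columns. As a hedged alternative, here is a minimal base R sketch that stacks the rows explicitly. It assumes, as stated in the question, that every column of mydata2 also exists in mydata; the names missing_cols, padded and stacked are made up for this illustration.

```r
mydata <- data.frame(
  ID = c(1, 1, 1, 2, 2), B = rep("b", 5), C = rep("c", 5), D = rep("d", 5)
)
mydata2 <- data.frame(ID = c(3, 4), B = c("b2", "b2"), C = c("c2", "c2"))

# Columns present in mydata but absent from mydata2 (here only "D")
missing_cols <- setdiff(names(mydata), names(mydata2))

# Work on a copy of mydata2 and add the missing columns, filled with NA
padded <- mydata2
padded[missing_cols] <- NA

# Reorder the columns to match mydata, then stack the rows
stacked <- rbind(mydata, padded[names(mydata)])
```

Here stacked holds the five rows of mydata followed by the two rows of mydata2, with NA in D for the latter. If dplyr is available, dplyr::bind_rows(mydata, mydata2) should give the same stacking in one call, since it fills columns missing from one input with NA.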