
Merging data frames of different lengths without unique keys


I am trying to merge two data frames of different lengths without using a unique key.

For example:

Name <- c("Steve","Peter")
Age <- c(10,20)

df1 <- data.frame(Name,Age)

> df1
   Name Age
1 Steve  10
2 Peter  20

Name <- c("Jason","Nelson")
School <- c("xyz","abc")

df2 <- data.frame(Name,School)

> df2
    Name School
1  Jason    xyz
2 Nelson    abc

I want to combine these two tables so that the result has all columns, with NA in the cells where a row's original table lacked that column. It should look something like this:

    Name Age School
1  Steve  10   <NA>
2  Peter  20   <NA>
3  Jason  NA    xyz
4 Nelson  NA    abc

Thank you in advance!


Solution

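  • dplyr::bind_rows() stacks the frames, matching columns by name and filling the missing ones with NA:

    dplyr::bind_rows(df1,df2)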
    # Warning in bind_rows_(x, .id) :
    #   Unequal factor levels: coercing to character
    # Warning in bind_rows_(x, .id) :
    #   binding character and factor vector, coercing into character vector
    # Warning in bind_rows_(x, .id) :
    #   binding character and factor vector, coercing into character vector
    #     Name Age School
    # 1  Steve  10   <NA>
    # 2  Peter  20   <NA>
    # 3  Jason  NA    xyz
    # 4 Nelson  NA    abc
    

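    The warnings appear because the string columns are factors with different levels (the default in data.frame() before R 4.0.0). You can avoid some of them by adding each frame's missing columns up front, filled with NA, which also works with plain base R: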

    df2 <- cbind(df2, df1[NA,setdiff(names(df1), names(df2)),drop=FALSE])
    df1 <- cbind(df1, df2[NA,setdiff(names(df2), names(df1)),drop=FALSE])
    df1
    #       Name Age School
    # NA   Steve  10   <NA>
    # NA.1 Peter  20   <NA>
    df2
    #        Name School Age
    # NA    Jason    xyz  NA
    # NA.1 Nelson    abc  NA
    
    # ensure we use the same column order for both frames
    nms <- names(df1)
    rbind(df1[,nms], df2[,nms])
    #         Name Age School
    # NA     Steve  10   <NA>
    # NA.1   Peter  20   <NA>
    # NA1    Jason  NA    xyz
    # NA.11 Nelson  NA    abc
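
The pre-assignment idea above can also be wrapped in a small function. The following is only a minimal sketch, not part of the original answer: `bind_rows_fill` is a hypothetical name, and the example assumes the frames are built with `stringsAsFactors = FALSE` so that no factor coercion is involved. Because the missing columns are added by assignment rather than by indexing with `NA`, the row names stay `1` to `4` instead of `NA`, `NA.1`, and so on.

```r
# Hypothetical helper (not from the original answer): stack two data frames,
# filling columns that one of them lacks with NA.
bind_rows_fill <- function(a, b) {
  # Add each column that is missing from one frame as an all-NA column.
  for (nm in setdiff(names(b), names(a))) a[[nm]] <- NA
  for (nm in setdiff(names(a), names(b))) b[[nm]] <- NA
  # Put b's columns in the same order as a's, then stack the rows.
  rbind(a, b[names(a)])
}

# Assumption: strings are kept as character vectors, not factors.
df1 <- data.frame(Name = c("Steve", "Peter"), Age = c(10, 20),
                  stringsAsFactors = FALSE)
df2 <- data.frame(Name = c("Jason", "Nelson"), School = c("xyz", "abc"),
                  stringsAsFactors = FALSE)

result <- bind_rows_fill(df1, df2)
print(result)
#     Name Age School
# 1  Steve  10   <NA>
# 2  Peter  20   <NA>
# 3  Jason  NA    xyz
# 4 Nelson  NA    abc
```

The column order of the result follows the first frame, with any extra columns from the second frame appended at the end.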