I am trying to merge two data frames of different lengths without using a unique key.
For example:
Name <- c("Steve","Peter")
Age <- c(10,20)
df1 <- data.frame(Name,Age)
> df1
Name Age
1 Steve 10
2 Peter 20
Name <- c("Jason","Nelson")
School <- c("xyz","abc")
df2 <- data.frame(Name,School)
> df2
Name School
1 Jason xyz
2 Nelson abc
I want to join these two tables so that I have all columns and have NA cells for rows that didn't have that column originally. It should look something like this:
Name Age School
1 Steve 10 <NA>
2 Peter 20 <NA>
3 Jason NA xyz
4 Nelson NA abc
Thank you in advance!
dplyr::bind_rows(df1,df2)
# Warning in bind_rows_(x, .id) :
# Unequal factor levels: coercing to character
# Warning in bind_rows_(x, .id) :
# binding character and factor vector, coercing into character vector
# Warning in bind_rows_(x, .id) :
# binding character and factor vector, coercing into character vector
# Name Age School
# 1 Steve 10 <NA>
# 2 Peter 20 <NA>
# 3 Jason NA xyz
# 4 Nelson NA abc
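The warnings above come from the Name and School columns being factors, which was the data.frame() default before R 4.0.0. As a minimal sketch, assuming dplyr is installed, building the frames with stringsAsFactors = FALSE keeps those columns as character, so bind_rows should have nothing to coerce. The name `combined` is only for illustration.

```r
library(dplyr)

# Keep strings as character instead of factor (already the default in R >= 4.0.0)
df1 <- data.frame(Name = c("Steve", "Peter"), Age = c(10, 20),
                  stringsAsFactors = FALSE)
df2 <- data.frame(Name = c("Jason", "Nelson"), School = c("xyz", "abc"),
                  stringsAsFactors = FALSE)

# bind_rows matches columns by name and fills the missing ones with NA
combined <- bind_rows(df1, df2)
combined
```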
You can avoid some of these warnings by adding each frame's missing columns up front, which also works well with base R:
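# add each frame's missing columns as typed NA; rep() sizes the filler to the
# target frame, so this also works when the frames have different row counts
df2 <- cbind(df2, df1[rep(NA_integer_, nrow(df2)),setdiff(names(df1), names(df2)),drop=FALSE])
df1 <- cbind(df1, df2[rep(NA_integer_, nrow(df1)),setdiff(names(df2), names(df1)),drop=FALSE])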
df1
# Name Age School
# NA Steve 10 <NA>
# NA.1 Peter 20 <NA>
df2
# Name School Age
# NA Jason xyz NA
# NA.1 Nelson abc NA
# ensure we use the same column order for both frames
nms <- names(df1)
rbind(df1[,nms], df2[,nms])
# Name Age School
# NA Steve 10 <NA>
# NA.1 Peter 20 <NA>
# NA1 Jason NA xyz
# NA.11 Nelson NA abc
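The base R steps can be wrapped into a small function. This is only a sketch: `bind_fill` is a hypothetical helper name, not something from base R or dplyr, and the three-row df1 is made up here to show frames of different lengths. It fills each frame's missing columns with NA of the matching type, binds the rows, and resets the row names so they are not left as NA, NA.1 and so on.

```r
# Hypothetical helper: row-bind two data frames, filling columns that one
# of them lacks with NA of the same type as in the other frame.
bind_fill <- function(a, b) {
  # Indexing a vector with NA_integer_ yields an NA of that vector's type
  for (nm in setdiff(names(b), names(a))) {
    a[[nm]] <- b[[nm]][rep(NA_integer_, nrow(a))]
  }
  for (nm in setdiff(names(a), names(b))) {
    b[[nm]] <- a[[nm]][rep(NA_integer_, nrow(b))]
  }
  # Use the same column order for both frames before binding
  out <- rbind(a, b[, names(a), drop = FALSE])
  rownames(out) <- NULL  # replace any leftover row names with 1..n
  out
}

# Frames with different numbers of rows (example data assumed for illustration)
df1 <- data.frame(Name = c("Steve", "Peter", "Ann"), Age = c(10, 20, 30),
                  stringsAsFactors = FALSE)
df2 <- data.frame(Name = c("Jason", "Nelson"), School = c("xyz", "abc"),
                  stringsAsFactors = FALSE)

out <- bind_fill(df1, df2)
out
```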