r, dataframe, dplyr

Bind two columns of two dataframes into a single dataframe, filling with NA where the values do not match


Imagine I have these two dataframes:

df1 <- data.frame(column_df1 = c("A", "B", "C", "D"))
df2 <- data.frame(column_df2 = c("A", "B", "C", "E"))

I want to merge them so I get a dataframe that has the following structure:

df3  <- data.frame(column_df1 = c("A", "B", "C", "D", NA),
                   column_df2 = c("A", "B", "C", NA, "E"))

Solution

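  • Create an id column in each dataframe from its own values, then do a full outer join with merge(); all = TRUE keeps the unmatched rows from both sides and fills the gaps with NA: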

    merge(cbind(id = df1$column_df1, df1),
          cbind(id = df2$column_df2, df2),
          by = "id", all = TRUE)
    #   id column_df1 column_df2
    # 1  A          A          A
    # 2  B          B          B
    # 3  C          C          C
    # 4  D          D       <NA>
    # 5  E       <NA>          E
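
The merged result above still carries the helper id column, while the df3 asked for in the question has only column_df1 and column_df2. A minimal sketch of one way to drop it, using base R only; the intermediate name merged is an assumption introduced here for illustration and is not part of the original answer:

```r
df1 <- data.frame(column_df1 = c("A", "B", "C", "D"))
df2 <- data.frame(column_df2 = c("A", "B", "C", "E"))

# Full outer join on a temporary key, as in the answer above.
# `merged` is a hypothetical name used only in this sketch.
merged <- merge(cbind(id = df1$column_df1, df1),
                cbind(id = df2$column_df2, df2),
                by = "id", all = TRUE)

# Keep only the two original columns, dropping the helper key.
df3 <- merged[, c("column_df1", "column_df2")]
df3
```

Note that merge() sorts the rows by the key by default, so the order here follows the sorted id values; with other data the rows may not come out in their original order.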