Search code examples
rdataframejoinmergeleft-join

Merge two data frames if a list of string matches and list the unmatch string as NA R


I have two data frames and I want to assign or paste the string of second data frame in the same row if the string of Drug_Name and Drugz matches.

NOTE: If I do it by left_join or merge merge(df1, df2, all.x = TRUE, by.x = "Drug_Names", by.y = "Drugz"), it returns me the exact same dataframe that is similar to DF1.

Below is the example of with required DF

DF1

>     Pat_ID       Date      Code      Drug_Names
>     AB1     2010-12-09     1.1.1     Alpha
>     AB1     2010-12-15     1.1.1     Alpha
>     AB1     2010-12-15     1.1.1     Beta
>     Ax2     2010-12-09     1.1.1     Beta
>     Ax2     2010-12-17     1.1.1     Beta
>     Aq3     2011-02-09     1.1.1     Gamma
>     Aq3     2011-04-25     1.1.1     Gamma
>     Aw4     2011-04-25     1.1.1     Tango

DF2

Codez      Drugz
 1.1.1     Alpha
 1.1.3     Gamma

Required DF3

Pat_ID       Date      Code    Drug_Names    Drugz     Codez 
>     AB1     2010-12-09     1.1.1     Alpha      Alpha     1.1.1
>     AB1     2010-12-15     1.1.1     Alpha      Alpha     1.1.1
>     AB1     2010-12-15     1.1.1     Beta        NA         NA
>     Ax2     2010-12-09     1.1.1     Beta        NA         NA
>     Ax2     2010-12-17     1.1.1     Beta        NA         NA
>     Aq3     2011-02-09     1.1.1     Gamma      Gamma      1.1.3
>     Aq3     2011-04-25     1.1.1     Gamma      Gamma      1.1.3
>     Aw4     2011-04-25     1.1.1     Tango       NA         NA

Solution

  • You can use match.

    cbind(DF1, DF2[match(DF1$Drug_Names, DF2$Drugz),2:1])
    #     Pat_ID       Date  Code Drug_Names Drugz Codez
    #1       AB1 2010-12-09 1.1.1      Alpha Alpha 1.1.1
    #1.1     AB1 2010-12-15 1.1.1      Alpha Alpha 1.1.1
    #NA      AB1 2010-12-15 1.1.1       Beta  <NA>  <NA>
    #NA.1    Ax2 2010-12-09 1.1.1       Beta  <NA>  <NA>
    #NA.2    Ax2 2010-12-17 1.1.1       Beta  <NA>  <NA>
    #2       Aq3 2011-02-09 1.1.1      Gamma Gamma 1.1.3
    #2.1     Aq3 2011-04-25 1.1.1      Gamma Gamma 1.1.3
    #NA.3    Aw4 2011-04-25 1.1.1      Tango  <NA>  <NA>
    

    Or in case using merge add the column Drug_Names to DF2.

    merge(DF1, cbind(Drug_Names = DF2$Drugz, DF2), all.x = TRUE)
    #  Drug_Names Pat_ID       Date  Code Codez Drugz
    #1      Alpha    AB1 2010-12-09 1.1.1 1.1.1 Alpha
    #2      Alpha    AB1 2010-12-15 1.1.1 1.1.1 Alpha
    #3       Beta    AB1 2010-12-15 1.1.1  <NA>  <NA>
    #4       Beta    Ax2 2010-12-09 1.1.1  <NA>  <NA>
    #5       Beta    Ax2 2010-12-17 1.1.1  <NA>  <NA>
    #6      Gamma    Aq3 2011-02-09 1.1.1 1.1.3 Gamma
    #7      Gamma    Aq3 2011-04-25 1.1.1 1.1.3 Gamma
    #8      Tango    Aw4 2011-04-25 1.1.1  <NA>  <NA>
    

    Data

    DF1 <- read.table(header=TRUE, text="Pat_ID       Date      Code      Drug_Names
         AB1     2010-12-09     1.1.1     Alpha
         AB1     2010-12-15     1.1.1     Alpha
         AB1     2010-12-15     1.1.1     Beta
         Ax2     2010-12-09     1.1.1     Beta
         Ax2     2010-12-17     1.1.1     Beta
         Aq3     2011-02-09     1.1.1     Gamma
         Aq3     2011-04-25     1.1.1     Gamma
         Aw4     2011-04-25     1.1.1     Tango")
    
    DF2 <- read.table(header=TRUE, text="Codez      Drugz
     1.1.1     Alpha
     1.1.3     Gamma")