Search code examples
rjoindplyrmultiplication

Combine two tables and multiply combinations from first into second


I'd like to combine two tables based on matching conditions from the second table. However, for the fields that do not match I'd like to duplicate the records from the first table.

I'm using dplyr and have tried full_join but this just combines the two tables and does not duplicate the non-matching rows.

Table 1

K6  K7  SL  EY  Val.1
Aero    Gen Asia    Asia    1
Aero    Arp Asia    Asia    2
Aero    Pro Asia    Asia    8
Aero    Lgt Asia    Asia    6

Table 2

K6  SL  EY  Val.2
Aero    Asia    Asia    15
Aero    Asia    US  20

Table 3 i.e. Combinations of Table 1 and Table 2 (with duplicates from table 1 where EY does not match):

K6  K7  SL  EY  Val.1   Val.2
Aero    Gen Asia    Asia    1   15
Aero    Arp Asia    Asia    2   15
Aero    Pro Asia    Asia    8   15
Aero    Lgt Asia    Asia    6   15
Aero    Gen Asia    US  0   20
Aero    Arp Asia    US  0   20
Aero    Pro Asia    US  0   20
Aero    Lgt Asia    US  0   20

The K6, SL, EY combination of Aero, Asia, US in table 2 but not in table 1, therefore I'd like to duplicate the values in table 1 and but with EY set to US


Solution

  • To duplicate the table for values of EY that do not exist in table 1, I suggest joining to table 2 on just columns K6 and SL, not on column EY.

    table_1 <- data.frame(K6=rep("Aero",4), K7=c("Gen","Arp","Pro","Lgt"), SL=rep("Asia",4), EY=rep("Asia",4), Val.1=c(1,2,8,6), stringsAsFactors = FALSE)
    table_2 <- data.frame(K6=rep("Aero",2), SL=rep("Asia",2), EY=c("Asia","US"), Val.2=c(15,20), stringsAsFactors = FALSE)
    
    library(dplyr)
    table_3 <- left_join(table_2, table_1, by=c("K6","SL"))
    
        K6   SL EY.x Val.2  K7 EY.y Val.1
    1 Aero Asia Asia    15 Gen Asia     1
    2 Aero Asia Asia    15 Arp Asia     2
    3 Aero Asia Asia    15 Pro Asia     8
    4 Aero Asia Asia    15 Lgt Asia     6
    5 Aero Asia   US    20 Gen Asia     1
    6 Aero Asia   US    20 Arp Asia     2
    7 Aero Asia   US    20 Pro Asia     8
    8 Aero Asia   US    20 Lgt Asia     6
    

    Then, for all rows where EY.x not does not match EY.y, set Val.1 to 0.

    table_3 <- table_3 %>%
      mutate(Val.1 = ifelse(EY.x==EY.y, Val.1, 0)) %>%
      select(-EY.y, EY=EY.x)
    
        K6   SL   EY Val.2  K7 Val.1
    1 Aero Asia Asia    15 Gen     1
    2 Aero Asia Asia    15 Arp     2
    3 Aero Asia Asia    15 Pro     8
    4 Aero Asia Asia    15 Lgt     6
    5 Aero Asia   US    20 Gen     0
    6 Aero Asia   US    20 Arp     0
    7 Aero Asia   US    20 Pro     0
    8 Aero Asia   US    20 Lgt     0