Search code examples
rjoinmergedata.table

R data.table updating a table based on an inner join with another table


I need to update a subset of a table based on information in another table:

main_dt = data.table(ID = 1:3, flag=c(TRUE,TRUE,FALSE), treatment_id = c(1,2,1), total_cost = 0)

costs_dt = data.table(treatment_id = 1:3, treatment_cost = c(3,5,8))

main_dt

      ID   flag treatment_id total_cost
   <int> <lgcl>        <num>      <num>
1:     1   TRUE            1          0
2:     2   TRUE            2          0
3:     3  FALSE            1          0

cost_dt

   treatment_id treatment_cost
          <int>          <num>
1:            1              3
2:            2              5
3:            3              8

I would like to end up with:

> main_dt
      ID   flag treatment_id total_cost
   <int> <lgcl>        <num>      <num>
1:     1   TRUE            1          3
2:     2   TRUE            2          5
3:     3  FALSE            1          0

I've been trying to do something like:

main_dt[flag==TRUE,][costs_dt, total_cost := total_cost + treatment_cost, on="treatment_ID", nomatch=NULL]

But this is getting an error because it has both a := assignment and a nomatch. If I remove the nomatch it still doesn't seem to work. Obviously I could join without the flag==TRUE, but then the row I don't want updated will be. Also, if I don't do an inner join I end up with a row with a bunch of NAs for treatment_id = 3.

I feel like this should be totally trivial but I am struggling to figure it out...


Solution

  • You could use fifelse.

    > library(data.table)
    > main_dt[costs_dt, on = "treatment_id", total_cost := fifelse(flag, total_cost + treatment_cost, total_cost)]
    > main_dt
          ID   flag treatment_id total_cost
       <int> <lgcl>        <num>      <num>
    1:     1   TRUE            1          3
    2:     2   TRUE            2          5
    3:     3  FALSE            1          0