I need to update a subset of a table based on information in another table:
main_dt = data.table(ID = 1:3, flag=c(TRUE,TRUE,FALSE), treatment_id = c(1,2,1), total_cost = 0)
costs_dt = data.table(treatment_id = 1:3, treatment_cost = c(3,5,8))
main_dt
ID flag treatment_id total_cost
<int> <lgcl> <num> <num>
1: 1 TRUE 1 0
2: 2 TRUE 2 0
3: 3 FALSE 1 0
cost_dt
treatment_id treatment_cost
<int> <num>
1: 1 3
2: 2 5
3: 3 8
I would like to end up with:
> main_dt
ID flag treatment_id total_cost
<int> <lgcl> <num> <num>
1: 1 TRUE 1 3
2: 2 TRUE 2 5
3: 3 FALSE 1 0
I've been trying to do something like:
main_dt[flag==TRUE,][costs_dt, total_cost := total_cost + treatment_cost, on="treatment_ID", nomatch=NULL]
But this is getting an error because it has both a := assignment and a nomatch. If I remove the nomatch it still doesn't seem to work. Obviously I could join without the flag==TRUE, but then the row I don't want updated will be. Also, if I don't do an inner join I end up with a row with a bunch of NAs for treatment_id = 3.
I feel like this should be totally trivial but I am struggling to figure it out...
You could use fifelse
.
> library(data.table)
> main_dt[costs_dt, on = "treatment_id", total_cost := fifelse(flag, total_cost + treatment_cost, total_cost)]
> main_dt
ID flag treatment_id total_cost
<int> <lgcl> <num> <num>
1: 1 TRUE 1 3
2: 2 TRUE 2 5
3: 3 FALSE 1 0