Search code examples
rdata.tablesubtraction

Subtract data.table from another data.table when unique row ID and column names match


I would like to subtract dt2 from dt1 to generate the output shown in dt3. I would then like to subset dt3 so only rows that contain a negative value remain (like dt4).

dt1 <- data.table(
  UID=c("A001","A002","A003","B001","B002","B003","C001","C002","C003"),
  Var1=c(100, 200, 300, 400, 500,600,700,800,900),
  Var2=c(1000, 2000, 3000, 4000, 5000,6000,7000,8000,9000),
  Var3=c(10000, 20000, 30000, 40000, 50000,60000,70000,80000,90000),
  Var4=c(15000, 25000, 35000, 45000, 55000,65000,75000,85000,95000))

dt2 <- data.table(
  UID=c("A001","A003","B001","B003","C001","C003"),
  Var1=c(10, 20, 30, 40, 50,950),
  Var2=c(100, 2500, 300, 400, 500,600),
  Var3=c(1000, 2000, 3000, 4000, 5000,6000))

dt3 <- data.table(
  UID=c("A001","A002","A003","B001","B002","B003","C001","C002","C003"),
  Var1=c(90, 200, 280, 370, 500,560,650,800,-50),
  Var2=c(900, 2000, -500, 3700, 5000,5600,6500,8000,8400),
  Var3=c(9000, 20000, 28000, 37000, 50000,56000,65000,80000,84000),
  Var4=c(15000, 25000, 35000, 45000, 55000,65000,75000,85000,95000))

dt4 <- dt3[c(3,9),]

Solution

  • Another possible approach:

    DT <- copy(dt1)
    
    #subtracting
    cols <- setdiff(names(dt2), "UID")
    DT[dt2, on=.(UID), (cols) := .SD - mget(paste0("i.", cols)), .SDcols=cols]
    
    #subsetting
    DT[Reduce(`|`, lapply(DT, `<`, 0))]
    

    output:

        UID Var1 Var2  Var3  Var4
    1: C003  -50 8400 84000 95000