Search code examples
rdplyr

How to lookup and sum multiple columns in R


Suppose I have 2 dataframes structured as such:

GROUPS:

P1      P2      P3      P4
123     213     312     231
345     123     213     567

INDIVIDUAL_RESULTS:

ID      SCORE
123     23
213     12
312     11
213     19
345     10
567     22

I want to add a column to the GROUPS which is a sum of each of their individual results:

P1      P2      P3      P4      SCORE
123     213     312     231     65

I've tried using various merge techniques, but have really just created a mess. I feel like there's a simple solution I just don't know about, would really appreciate some guidance!


Solution

  • d1=read.table(text="
    P1      P2      P3      P4
    123     213     312     231
    345     123     213     567",h=T)
    
    d2=read.table(text="
    ID      SCORE
    123     23
    213     12
    312     11
    231     19
    345     10
    567     22",h=T)
    

    I will be using the apply and match functions. Apply will apply the match function to each row of d1, match will find the matching values from the row of d1 and d2$ID (their indices) and then take the values in d2$SCORE at those indices. In the end we sum them up.

    d1$SCORE=apply(d1,1,function(x){
      sum(d2$SCORE[match(x,d2$ID)])
    })
    

    and the result

       P1  P2  P3  P4 SCORE
    1 123 213 312 231    65
    2 345 123 213 567    67