r, if-statement, dplyr, data-manipulation

Use filter multiple times for average calculation


I need to calculate an average for each condition in my data, either by calling the filter function several times or with some alternative.

Here is the dataset:

df <- data.frame(id = c(1,2,3,4,5,6,7), 
                 cond = c("Y", "Y", "N", "Y", "N", "Y", "N"), score = c(3,4,5,2,1,2,9))

I need to calculate the average of score separately for cond = "Y" and cond = "N", then append it to the original dataset as a new column, like this:

  id cond score average
1  1    Y     3    2.75
2  2    Y     4    2.75
3  3    N     5    5.00
4  4    Y     2    2.75
5  5    N     1    5.00
6  6    Y     2    2.75
7  7    N     9    5.00

Solution

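  • One option is data.table: convert df with setDT() and add the group mean as a new column by reference, grouped by cond: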

    library(data.table)
    setDT(df)[, average := mean(score), by = cond]
    

    Output:

    > df
          id   cond score average
       <num> <char> <num>   <num>
    1:     1      Y     3    2.75
    2:     2      Y     4    2.75
    3:     3      N     5    5.00
    4:     4      Y     2    2.75
    5:     5      N     1    5.00
    6:     6      Y     2    2.75
    7:     7      N     9    5.00
    

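    Or with collapse, where fmean() computes the grouped mean and TRA = 1 ("replace_fill") returns that mean for every row of the group: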

    library(collapse)
    df$average <- fmean(df$score, df$cond, TRA = 1)
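
Since the question is tagged dplyr, a grouped mutate() is another way to get the same column without calling filter() once per condition. The sketch below is not part of the answer above: the object names df_dplyr and df_base are illustrative, and a base R equivalent with ave() is included only for comparison.

```r
library(dplyr)

# Same data as in the question
df <- data.frame(id = c(1, 2, 3, 4, 5, 6, 7),
                 cond = c("Y", "Y", "N", "Y", "N", "Y", "N"),
                 score = c(3, 4, 5, 2, 1, 2, 9))

# dplyr: mean(score) is evaluated within each cond group,
# and mutate() keeps every original row
df_dplyr <- df %>%
  group_by(cond) %>%
  mutate(average = mean(score)) %>%
  ungroup()

# Base R: ave() applies mean() within each group by default
df_base <- transform(df, average = ave(score, cond))
```

Both versions keep all seven rows and repeat the group mean on each row, so no separate filter-then-join step should be needed.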