Tags: r, data.table, tidyverse

Leave-one-out means by group in R


Imagine a table of individuals observed over time in different firms. For every individual, I'm trying to compute the mean wage of their co-workers (i.e. the mean wage in their firm at time t, excluding them). I have working code using data.table in R, but I'm wondering whether there is a better, more efficient way of doing this:

library(data.table)

# i = individual, t = time period, f = firm, w = wage
foo <- data.table(
  i = rep(1:6, each = 2),
  t = rep(1:2, 6),
  f = rep(1:2, each = 6),
  w = 1:12
)

foo[, x := mean(foo[t == .BY$t & f == foo[i == .BY$i & t == .BY$t]$f & i != .BY$i]$w), by = .(i, t)]

Solution

  • Maybe this, which groups by firm and time and drops each individual's own wage before taking the mean:

    foo[, V1 := sapply(i, function(x) mean(w[-match(x, i)])), by = .(f, t)]
    #     i t f  w V1
    #  1: 1 1 1  1  4
    #  2: 1 2 1  2  5
    #  3: 2 1 1  3  3
    #  4: 2 2 1  4  4
    #  5: 3 1 1  5  2
    #  6: 3 2 1  6  3
    #  7: 4 1 2  7 10
    #  8: 4 2 2  8 11
    #  9: 5 1 2  9  9
    # 10: 5 2 2 10 10
    # 11: 6 1 2 11  8
    # 12: 6 2 2 12  9
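
A possibly faster variant, offered only as a sketch: within each firm-period group, the leave-one-out mean equals the group total minus the individual's own wage, divided by the group size minus one, so no per-row `sapply` is needed. The column name `loo` is a hypothetical choice, not from the question, and the code assumes each individual appears at most once per firm and time period.

```r
library(data.table)

# Same toy data as in the question
foo <- data.table(
  i = rep(1:6, each = 2),
  t = rep(1:2, 6),
  f = rep(1:2, each = 6),
  w = 1:12
)

# Leave-one-out mean: (group total - own wage) / (group size - 1).
# `loo` is an illustrative column name (assumption).
# A group with a single row yields 0/0 = NaN, i.e. "no co-workers".
foo[, loo := (sum(w) - w) / (.N - 1), by = .(f, t)]
```

On the toy data this should reproduce the `V1` column shown above. Because it relies only on vectorised arithmetic per group, it would be expected to scale better than calling `mean()` once per row, though that is worth benchmarking on real data.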