Suppose I have some grouping information in the form of a grouping vector:
group = c(1,1,2,2,3,3,3)
This says I have three groups: two groups of size 2 and one group of size 3. Now suppose I have a vector (filled with random numbers):
x = c(1.5, 3.1, 5.4, -4.5, 2.2, 4.4, 1.1)
Is there an efficient way in R to loop over this vector and apply a function within each group?
For example, summation within each group using a for loop would be:
sums = rep(0,3)
for (i in 1:3) {
  grp_ids = which(group == i)
  sums[i] = sum(x[grp_ids])
}
Is there an easier way to do this?
You can use group_by() together with summarize() from {dplyr}:
library(dplyr)
group = c(1, 1, 2, 2, 3, 3, 3)
x = c(1.5, 3.1, 5.4, -4.5, 2.2, 4.4, 1.1)
df <- data.frame(group, x)
result <- df %>%
  group_by(group) %>%
  summarize(sums = sum(x))
> print(result)
# A tibble: 3 × 2
  group  sums
  <dbl> <dbl>
1     1   4.6
2     2   0.9
3     3   7.7
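If you would rather not load a package, base R can probably do the same job. The following is a minimal sketch, not part of the dplyr approach above, that reuses the group and x vectors from the question; the variable name per_element is made up for illustration.

```r
group <- c(1, 1, 2, 2, 3, 3, 3)
x <- c(1.5, 3.1, 5.4, -4.5, 2.2, 4.4, 1.1)

# tapply() splits x by the values of group and applies sum() to each piece;
# the result has one entry per group, named by the group label
sums <- tapply(x, group, sum)

# ave() applies the function within each group but returns a vector the same
# length as x, so every element carries its own group's sum
# (per_element is a hypothetical name used only in this sketch)
per_element <- ave(x, group, FUN = sum)
```

tapply() is the closer match to the for loop in the question, since it gives one value per group. ave() is handy when you need the group result lined up against the original vector, for example to compute each element's share of its group total.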