Consider this dataframe:
library(dplyr)
df <- data.frame(id = c(1,1,1,2,2), x = 1:5)
  id x
1  1 1
2  1 2
3  1 3
4  2 4
5  2 5
To get the average x value per id, I use
df |> group_by(id) |> dplyr::summarise(group_mean = mean(x))
# A tibble: 2 × 2
     id group_mean
  <dbl>      <dbl>
1     1        2
2     2        4.5
I need to calculate the average of these group means, which is (2 + 4.5) / 2 = 3.25. However, this code fails:
df |> group_by(id) |> dplyr::summarise(group_mean = mean(x)) |> mean(group_mean)
[1] NA
Warning message:
In mean.default(dplyr::summarise(group_by(df, id), group_mean = mean(x)), :
argument is not numeric or logical: returning NA
Any suggestions?
EDIT: This question is not a duplicate of the linked question mentioned by @shizzle, because I'm looking for the unbalanced mean of means, i.e., a second stage of aggregation, not the first stage of calculating averages.
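The pipe passes the whole summarised tibble to mean() as its first argument, and mean() cannot average a data frame, so it returns NA. You could use pull() to extract the group_mean column as a plain vector and then take its mean, like this: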
library(dplyr)
df |>
  group_by(id) |>
  dplyr::summarise(group_mean = mean(x)) |>
  pull(group_mean) |>
  mean()
#> [1] 3.25
Created on 2024-07-31 with reprex v2.1.0
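If you would rather keep the result as a tibble, another option is a second summarise() call. This is only a sketch of an alternative, not a replacement for the pull() approach: it relies on summarise() dropping the single grouping level after the first stage, and the names result and mean_of_means are made up for illustration.

```r
library(dplyr)

df <- data.frame(id = c(1, 1, 1, 2, 2), x = 1:5)

# First stage: one mean per id. With a single grouping variable,
# summarise() returns an ungrouped tibble, so the second summarise()
# aggregates over all group means at once.
result <- df |>
  group_by(id) |>
  summarise(group_mean = mean(x)) |>
  summarise(mean_of_means = mean(group_mean))

result$mean_of_means
#> [1] 3.25

# For comparison, the pooled mean weights every row equally,
# so the larger group (id 1) pulls it down.
mean(df$x)
#> [1] 3
```

The two numbers differ because the groups have different sizes; with equally sized groups the mean of means and the pooled mean would coincide.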