
R: Row means by group and year


I am trying to get an annual average of a column by group, and I am a bit stumped on how to proceed.

Say my data looks like this:

Year  Group    Full  Month
2022  Group 1  1     January
2022  Group 2  5     January
2022  Group 3  6     January
2023  Group 1  4     December
2023  Group 2  3     December
2023  Group 3  4     December

The Month column runs from January to December; I have shortened it here for simplicity. I would like to create another column, called "Annual Average", which is the average of the column "Full" for each year. The tricky part is that I need it by Group as well. The end result is that "Annual Average" holds the average by year and by group, so every row for a given year and group has the same number (e.g., if the 2022 average for Group 1 is 10, then every Group 1 row in 2022 shows 10).

Any help is much appreciated!

So far, my attempts have only generated a single row mean, not differentiated by year or group.


Solution

  • If you want to keep the original data structure and only add a new column, you can use group_by() and then mutate() like this:

    library(tidyverse)
    
    df <- data.frame(Year = c(2022L, 2022L, 2022L, 2023L, 2023L, 2023L),
               Group = c("Group 1", "Group 2", "Group 3", "Group 1", "Group 2", "Group 3"),
               Full = c(1, 5, 6, 4, 3, 4),
               Month = c("January", "January", "January", "December", "December", "December"))
    
    df |> group_by(Year, Group) |> 
            mutate(Annual_Average = mean(Full))
    #> # A tibble: 6 × 5
    #> # Groups:   Year, Group [6]
    #>    Year Group    Full Month    Annual_Average
    #>   <int> <chr>   <dbl> <chr>             <dbl>
    #> 1  2022 Group 1     1 January               1
    #> 2  2022 Group 2     5 January               5
    #> 3  2022 Group 3     6 January               6
    #> 4  2023 Group 1     4 December              4
    #> 5  2023 Group 2     3 December              3
    #> 6  2023 Group 3     4 December              4
    

    Created on 2024-04-10 with reprex v2.1.0

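    NB: don't forget to ungroup() afterwards; otherwise the result stays grouped by Year and Group, and later operations will still be applied per group.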
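In the sample data each Year/Group pair has only one row, so the average just equals Full. As a rough sketch of how this behaves with several months per group, the example below uses made-up data (df2 and its values are hypothetical, not from the question) and finishes with ungroup(). It also shows the .by argument of mutate(), which assumes dplyr 1.1.0 or later and returns an ungrouped result directly.

```r
library(dplyr)

# Hypothetical data: two months per Year/Group, so the mean differs from Full
df2 <- data.frame(
  Year  = c(2022L, 2022L, 2022L, 2022L, 2023L, 2023L, 2023L, 2023L),
  Group = c("Group 1", "Group 1", "Group 2", "Group 2",
            "Group 1", "Group 1", "Group 2", "Group 2"),
  Full  = c(1, 3, 5, 7, 4, 6, 3, 9),
  Month = c("January", "February", "January", "February",
            "January", "February", "January", "February")
)

# Same approach as above, followed by ungroup() to drop the grouping
result <- df2 |>
  group_by(Year, Group) |>
  mutate(Annual_Average = mean(Full)) |>
  ungroup()

# Alternative (assumes dplyr >= 1.1.0): per-operation grouping with .by,
# which never leaves the data grouped
result_by <- df2 |>
  mutate(Annual_Average = mean(Full), .by = c(Year, Group))

# Every row of a Year/Group pair now carries the same average
result$Annual_Average
#> [1] 2 2 6 6 5 5 6 6
```

Either form should give the same Annual_Average column; .by is mainly a convenience when the grouping is needed for one step only.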