I have a dataset that begins in year 1988 and ends in year 2020. I want to create averages for certain time intervals. For example, 5 years: 1988-1992, 1993-1997 and so on. But I want a new column with these averages.
For example, suppose I have this:
anos <- 1988:2020
valores <- c(15, 18, 20, NA, 25, 27, 28, NA, 32, 35, 36, 38, 40,
42, 45, 46, NA, 50, 52, 55, 56, 58, 60, NA, 65, 66,
68, 70, 72, 75, 76, 78, 80)
dataset <- data.frame(Ano = anos, Valor = valores)
I want to have this:
anos <- 1988:2020
valores <- c(15, 18, 20, NA, 25, 27, 28, NA, 32, 35, 36, 38, 40,
42, 45, 46, NA, 50, 52, 55, 56, 58, 60, NA, 65, 66,
68, 70, 72, 75, 76, 78, 80)
medias <- c(19.5, 19.5, 19.5, 19.5, 19.5,
30.5, 30.5, 30.5, 30.5, 30.5,
40.2, 40.2, 40.2, 40.2, 40.2,
50.75, 50.75, 50.75, 50.75, 50.75,
59.75, 59.75, 59.75, 59.75, 59.75,
70.2, 70.2, 70.2, 70.2, 70.2,
78, 78, 78)
dataset <- data.frame(Ano = anos, Valor = valores, Medias = medias)
There are five year averages for each time interval and the mean value repeats itself until the next five years. The last mean is an average of 3 values since the overall period is not a multiple of five.
EDIT: also, I will need to use group_by()
.
Here's a dplyr
version :
library(dplyr)
n <- 5
dataset %>%
arrange(Ano) %>%
group_by(Group = floor((Ano - first(Ano))/n)) %>%
mutate(Medias = mean(Valor, na.rm = TRUE)) %>%
data.frame()
which returns the following -
# Ano Valor Group Medias
#1 1988 15 0 19.50
#2 1989 18 0 19.50
#3 1990 20 0 19.50
#4 1991 NA 0 19.50
#5 1992 25 0 19.50
#6 1993 27 1 30.50
#7 1994 28 1 30.50
#8 1995 NA 1 30.50
#9 1996 32 1 30.50
#10 1997 35 1 30.50
#11 1998 36 2 40.20
#12 1999 38 2 40.20
#13 2000 40 2 40.20
#14 2001 42 2 40.20
#15 2002 45 2 40.20
#16 2003 46 3 50.75
#17 2004 NA 3 50.75
#18 2005 50 3 50.75
#19 2006 52 3 50.75
#20 2007 55 3 50.75
#21 2008 56 4 59.75
#22 2009 58 4 59.75
#23 2010 60 4 59.75
#24 2011 NA 4 59.75
#25 2012 65 4 59.75
#26 2013 66 5 70.20
#27 2014 68 5 70.20
#28 2015 70 5 70.20
#29 2016 72 5 70.20
#30 2017 75 5 70.20
#31 2018 76 6 78.00
#32 2019 78 6 78.00
#33 2020 80 6 78.00
In the example that you have shared you have a single entry for each year and data for all the year is present however, this will also work when :