I have data frame as :
df <- data.frame( date =seq(from = as.Date("2000-01-01"),
to = as.Date("2005-01-01"),'month'))
df <- df %>% mutate(cumsum = seq(1, length.out = length(date)))
I want to create a new column, which is the sum of the value in cumsum and every 12th value (one year back).
EDIT: I like both your answers! Actually I just found a problem for the solution for me (sorry my explanation was not quite clear.) Your approach gives me the sum of the value now and one year befor. But I do have seveal years and would need the cumsum of all overervation in previous years (so sum(x, lag(x,12), lag(x,24), lag (x,36)). I tried smth. like (rep(lag(cumsu, 12), nrow(df)/12). May you can help. Thanks!
The literal approach is to use lag
, and if you are assured of perfectly-spaced data, then @Jamie's answer is the most direct and simplest approach.
However, if there is a chance that you don't have all intermediate months, this could lag incorrectly. One way to guard against this is to self-join with the previous date.
df2 <- df[-20,] # just to impose some missingness
library(lubridate) # %m+%
df2 %>%
mutate(
# this is the more direct route, but with missingness it glitches
rolling_12 = cumsum + lag(cumsum, n = 12),
lastyear = date %m+% years(-1)
) %>%
left_join(df2, by = c("lastyear" = "date"), suffix = c("", "_12")) %>%
mutate(cumsum_12 = cumsum + cumsum_12) %>%
select(-lastyear)
# date cumsum rolling_12 cumsum_12
# 1 2000-01-01 1 NA NA
# 2 2000-02-01 2 NA NA
# 3 2000-03-01 3 NA NA
# 4 2000-04-01 4 NA NA
# 5 2000-05-01 5 NA NA
# 6 2000-06-01 6 NA NA
# 7 2000-07-01 7 NA NA
# 8 2000-08-01 8 NA NA
# 9 2000-09-01 9 NA NA
# 10 2000-10-01 10 NA NA
# 11 2000-11-01 11 NA NA
# 12 2000-12-01 12 NA NA
# 13 2001-01-01 13 14 14
# 14 2001-02-01 14 16 16
# 15 2001-03-01 15 18 18
# 16 2001-04-01 16 20 20
# 17 2001-05-01 17 22 22
# 18 2001-06-01 18 24 24
# 19 2001-07-01 19 26 26
# 20 2001-09-01 21 29 30 <-- this is where rolling_12 goes wrong
# 21 2001-10-01 22 31 32
# 22 2001-11-01 23 33 34
# 23 2001-12-01 24 35 36
# 24 2002-01-01 25 37 38
# 25 2002-02-01 26 39 40
# 26 2002-03-01 27 41 42
# 27 2002-04-01 28 43 44
# 28 2002-05-01 29 45 46
# 29 2002-06-01 30 47 48
# 30 2002-07-01 31 49 50
# 31 2002-08-01 32 51 NA
# 32 2002-09-01 33 54 54
# 33 2002-10-01 34 56 56
# 34 2002-11-01 35 58 58
# 35 2002-12-01 36 60 60
# 36 2003-01-01 37 62 62
# 37 2003-02-01 38 64 64
# 38 2003-03-01 39 66 66
# 39 2003-04-01 40 68 68
# 40 2003-05-01 41 70 70
# 41 2003-06-01 42 72 72
# 42 2003-07-01 43 74 74
# 43 2003-08-01 44 76 76
# 44 2003-09-01 45 78 78
# 45 2003-10-01 46 80 80
# 46 2003-11-01 47 82 82
# 47 2003-12-01 48 84 84
# 48 2004-01-01 49 86 86
# 49 2004-02-01 50 88 88
# 50 2004-03-01 51 90 90
# 51 2004-04-01 52 92 92
# 52 2004-05-01 53 94 94
# 53 2004-06-01 54 96 96
# 54 2004-07-01 55 98 98
# 55 2004-08-01 56 100 100
# 56 2004-09-01 57 102 102
# 57 2004-10-01 58 104 104
# 58 2004-11-01 59 106 106
# 59 2004-12-01 60 108 108
# 60 2005-01-01 61 110 110