I have a data like:
library(dplyr)
set.seed(123)
data <- data.frame(
date = rep(1:3, each=3),
grupo = rep(c("A", "B", "C"), 3),
x = runif(9, 10, 50),
y = runif(9, 20, 100),
z = runif(9, 5, 30)
)
# Convert group into factor
data$grupo <- as.factor(data$grupo)
# Order data
data <- data %>% arrange(grupo, date)
I want the first observation of each row to be (1+original_value/100). However, starting from the second row I want the formula to be modified to first_modified_value*(1 + second_observed_value/100), the third new row will do: second_modified_value*(1 + third_observed_value/100).
I have generated the following formula, but I am not able to state that lag() should take the modified value, and not the lag(original value):
# Apply formula in each group for variables, x,y,z
dataaa <- data %>%
group_by(grupo) %>%
mutate(across(c(x, y, z), ~if_else(row_number() == 1, (1 + ./100), lag(.)*(1 + ./100))))
Any help?
The cumprod()
function would appear to be the answer here as you are multiplying each value by the previous result:
dataaa <- data %>%
group_by(grupo) %>%
mutate(across(c(x, y, z), ~cumprod(1 + . / 100)))
dataaa
# A tibble: 9 × 5
# Groups: grupo [3]
date grupo x y z
<int> <fct> <dbl> <dbl> <dbl>
1 1 A 1.22 1.57 1.13
2 2 A 1.77 2.73 1.38
3 3 A 2.32 5.24 1.68
4 1 B 1.42 1.97 1.29
5 2 B 2.09 3.26 1.56
6 3 B 3.04 4.55 1.91
7 1 C 1.26 1.56 1.27
8 2 C 1.41 2.00 1.65
9 3 C 1.87 2.47 1.96