I'd like to create a custom function to avoid an extreme rate, in my case, lower until 2%, the opposite (increase the rate) is not a problem.
For this, I have a dataset:
# Dataset
area <- c("A","A","A","A","B","B","B","B","C","C","C","C")
dates <- c("05-01-2024","10-01-2024","18-01-2024","01-02-2024",
"05-01-2024","10-01-2024","18-01-2024","01-02-2024","05-01-2024",
"10-01-2024","18-01-2024", "01-02-2024")
response <- c(63,68,82,85,77,71,60,50,79,78,77,76)
my.ds <- data.frame(area = area, dates = dates, response = response)
I need a custom function that, if the last date is lower more than 2%, then the new value decreases only 2%, if the value increases or lower until 2%, there is nothing to do. I try to something like:
mutate(my.ds, response = ifelse(c(0, abs(diff(response))) < abs(response * .02), lag(response), response), .by = area)
but doesn't work.
My desirable new.my.ds
, must be:
# my.ds.new
# area dates response
#1 A 2024-01-05 63
#2 A 2024-01-10 68
#3 A 2024-01-18 82
#4 A 2024-02-01 85
#5 B 2024-01-05 77
#6 B 2024-01-10 76
#7 B 2024-01-18 75
#8 B 2024-02-01 74
#9 C 2024-01-05 79
#10 C 2024-01-10 78
#11 C 2024-01-18 77
#12 C 2024-02-01 76
Please, any help with it?
try this:
Please note, that ceiling()
rounds up the response value. If you don't want that, then you should change it to if (change < -threshold) capped_response[i] <- round(capped_response[i - 1]- capped_response[i - 1]* threshold,0)
library(dplyr)
area <- c("A","A","A","A","B","B","B","B","C","C","C","C")
dates <- c("05-01-2024","10-01-2024","18-01-2024","01-02-2024",
"05-01-2024","10-01-2024","18-01-2024","01-02-2024","05-01-2024",
"10-01-2024","18-01-2024", "01-02-2024")
response <- c(63,68,82,85,77,71,60,50,79,78,77,76)
my.ds <- data.frame(area = area, dates = dates, response = response)
# Custom function to cap changes in response
cap_response <- function(response, threshold = 0.02) {
capped_response <- response
for (i in 2:length(response)) {
change <- (capped_response[i] - capped_response[i - 1])/capped_response[i - 1]
if (change < -threshold) capped_response[i] <- ceiling(capped_response[i - 1]- capped_response[i - 1]* threshold)
# if (change < -threshold) capped_response[i] <- round(capped_response[i - 1]- capped_response[i - 1]* threshold,0)
}
return(capped_response)
}
# Apply the function using mutate and group_by
new.my.ds <- my.ds %>%
group_by(area) %>%
mutate(response = cap_response(response)) %>%
ungroup()
which finally prints your required result:
# A tibble: 12 × 3
area dates response
<chr> <chr> <dbl>
1 A 05-01-2024 63
2 A 10-01-2024 68
3 A 18-01-2024 82
4 A 01-02-2024 85
5 B 05-01-2024 77
6 B 10-01-2024 76
7 B 18-01-2024 75
8 B 01-02-2024 74
9 C 05-01-2024 79
10 C 10-01-2024 78
11 C 18-01-2024 77
12 C 01-02-2024 76