Create a custom function to avoid extreme rate

I'd like to create a custom function to avoid an extreme rate, in my case, lower until 2%, the opposite (increase the rate) is not a problem.

For this, I have a dataset:

# Dataset
area <- c("A","A","A","A","B","B","B","B","C","C","C","C")
dates <- c("05-01-2024","10-01-2024","18-01-2024","01-02-2024",
"05-01-2024","10-01-2024","18-01-2024","01-02-2024","05-01-2024",
"10-01-2024","18-01-2024", "01-02-2024")
response <- c(63,68,82,85,77,71,60,50,79,78,77,76)
my.ds <- data.frame(area = area, dates = dates, response = response)

I need a custom function that, if the last date is lower more than 2%, then the new value decreases only 2%, if the value increases or lower until 2%, there is nothing to do. I try to something like:

mutate(my.ds, response = ifelse(c(0, abs(diff(response))) < abs(response * .02), lag(response),  response),  .by = area)

but doesn't work.

My desirable new.my.ds, must be:

# my.ds.new
#  area      dates response
#1    A 2024-01-05       63
#2    A 2024-01-10       68
#3    A 2024-01-18       82
#4    A 2024-02-01       85
#5    B 2024-01-05       77
#6    B 2024-01-10       76
#7    B 2024-01-18       75
#8    B 2024-02-01       74
#9    C 2024-01-05       79
#10   C 2024-01-10       78
#11   C 2024-01-18       77
#12   C 2024-02-01       76

Please, any help with it?

Solution

try this:

Please note, that ceiling() rounds up the response value. If you don't want that, then you should change it to if (change < -threshold) capped_response[i] <- round(capped_response[i - 1]- capped_response[i - 1]* threshold,0)

library(dplyr)
area <- c("A","A","A","A","B","B","B","B","C","C","C","C")
dates <- c("05-01-2024","10-01-2024","18-01-2024","01-02-2024",
           "05-01-2024","10-01-2024","18-01-2024","01-02-2024","05-01-2024",
           "10-01-2024","18-01-2024", "01-02-2024")
response <- c(63,68,82,85,77,71,60,50,79,78,77,76)
my.ds <- data.frame(area = area, dates = dates, response = response)

# Custom function to cap changes in response
cap_response <- function(response, threshold = 0.02) {
  capped_response <- response
  for (i in 2:length(response)) {
    change <- (capped_response[i] - capped_response[i - 1])/capped_response[i - 1]
    if (change < -threshold) capped_response[i] <- ceiling(capped_response[i - 1]- capped_response[i - 1]* threshold)
    # if (change < -threshold) capped_response[i] <- round(capped_response[i - 1]- capped_response[i - 1]* threshold,0)
  }
  return(capped_response)
}

# Apply the function using mutate and group_by
new.my.ds <- my.ds %>%
  group_by(area) %>%
  mutate(response = cap_response(response)) %>%
  ungroup()

which finally prints your required result:

# A tibble: 12 × 3
   area  dates      response
   <chr> <chr>         <dbl>
 1 A     05-01-2024       63
 2 A     10-01-2024       68
 3 A     18-01-2024       82
 4 A     01-02-2024       85
 5 B     05-01-2024       77
 6 B     10-01-2024       76
 7 B     18-01-2024       75
 8 B     01-02-2024       74
 9 C     05-01-2024       79
10 C     10-01-2024       78
11 C     18-01-2024       77
12 C     01-02-2024       76