Penalized cumulative sum in R


I need to calculate a penalized cumulative sum.

Individuals "A", "B" and "C" were supposed to get tested every other year. Every time they get tested, they gain 1 point. However, when they miss a test, 1 point is deducted from their cumulative score.

I have the following code:

df1 <- data.frame(year = rep(1990:1995, 3),
                  person.id = c(rep("A", 6), rep("B", 6), rep("C", 6)),
                  needs.testing = rep(c("Yes", "No"), 9),
                  test.compliance = c(c(1,0,1,0,1,0), c(1,0,1,0,0,0), c(1,0,0,0,0,0)),
                  penalized.compliance.cum.sum = c(c(1,1,2,2,3,3), c(1,1,2,2,1,1), c(1,1,0,0,-1,-1)))
df1

...which gives the following:

  year person.id needs.testing test.compliance penalized.compliance.cum.sum
1  1990         A           Yes               1                            1
2  1991         A            No               0                            1
3  1992         A           Yes               1                            2
4  1993         A            No               0                            2
5  1994         A           Yes               1                            3
6  1995         A            No               0                            3
7  1990         B           Yes               1                            1
8  1991         B            No               0                            1
9  1992         B           Yes               1                            2
10 1993         B            No               0                            2
11 1994         B           Yes               0                            1
12 1995         B            No               0                            1
13 1990         C           Yes               1                            1
14 1991         C            No               0                            1
15 1992         C           Yes               0                            0
16 1993         C            No               0                            0
17 1994         C           Yes               0                           -1
18 1995         C            No               0                           -1

As is evident, "A" fully complied. "B" partially complied (in 1994 he was supposed to get tested but missed the test, so his cumulative sum drops from 2 to 1). Finally, "C" complied just once, in 1990; every time she needed to get tested after that, she missed the test.

What I need is some code to get the "penalized.compliance.cum.sum" variable.

Please note:

  1. Tests are every other year.
  2. The "penalized.compliance.cum.sum" variable carries the previous score forward from year to year.
  3. Points are deducted only when the individual misses a test in a testing year (marked "Yes" in the "needs.testing" variable).
  • For instance, individual "C" complies in 1990. In 1991 she doesn't need to get tested, so she keeps her score of 1. She then misses the 1992 test, and 1 is subtracted from her cumulative score, giving 0 in 1992. She also misses the 1994 test, ending the study with a score of -1.
  • I also need to be able to assign different penalties. In this example the penalty is 1, but I need to be able to use other values such as 0.5 or 0.1.

Thanks!


Solution

  • Using case_when

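    library(dplyr)

    # In testing years, add 1 for a completed test and subtract 1 for a missed one;
    # other years contribute 0. cumsum() then accumulates within each person.
    df1 %>%
       group_by(person.id) %>%
       mutate(res = cumsum(case_when(needs.testing == "Yes" ~ 1 - 2 * (test.compliance < 1),
                                     TRUE ~ 0)))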
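
The question also asks for penalties other than 1, which the expression above hard-codes. One possible way to generalize it is sketched below. The helper name `penalized_cumsum` and its `penalty` argument are hypothetical and not part of the original answer; the sketch assumes the data has the question's columns and sorts by person and year before accumulating.

```r
library(dplyr)

# Example data from the question, without the hand-computed target column.
df1 <- data.frame(
  year            = rep(1990:1995, 3),
  person.id       = rep(c("A", "B", "C"), each = 6),
  needs.testing   = rep(c("Yes", "No"), 9),
  test.compliance = c(1, 0, 1, 0, 1, 0,
                      1, 0, 1, 0, 0, 0,
                      1, 0, 0, 0, 0, 0)
)

# Hypothetical helper: `penalty` is the amount deducted for each missed test.
penalized_cumsum <- function(data, penalty = 1) {
  data %>%
    arrange(person.id, year) %>%   # cumsum() depends on row order
    group_by(person.id) %>%
    mutate(res = cumsum(case_when(
      needs.testing == "Yes" & test.compliance == 1 ~ 1,        # test taken
      needs.testing == "Yes"                        ~ -penalty, # test missed
      TRUE                                          ~ 0         # no test due
    ))) %>%
    ungroup()
}

res_one  <- penalized_cumsum(df1, penalty = 1)    # same scores as the question
res_half <- penalized_cumsum(df1, penalty = 0.5)  # milder penalty
```

With `penalty = 1` this should reproduce the "penalized.compliance.cum.sum" column from the question. With `penalty = 0.5`, individual "C" ends the study at 0 instead of -1.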