
How to mutate a column for the number at risk (survival)

As a personal exercise, I was wondering how to generate a column for the number at risk, or the number of observations that have not yet experienced the event at time t.

Here's some sample data:

library(tibble)

df <- tibble(
  event = c(1, 1, 1, 0, 0),
  time = c(10, 20, 30, 40, 50)
)

The desired output should look like:

# A tibble: 5 x 3
  event  time nrisk
  <dbl> <dbl> <dbl>
1     1    10     4
2     1    20     3
3     1    30     2
4     0    40     2
5     0    50     2


  • If every row is one individual and the rows are sorted by time, you can subtract the cumulative sum of event from the number of rows in the data frame.

    df$n_risk <- nrow(df) - cumsum(df$event)
    #  event  time n_risk
    #  <dbl> <dbl>  <dbl>
    #1     1    10      4
    #2     1    20      3
    #3     1    30      2
    #4     0    40      2
    #5     0    50      2
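
Since the question asks about mutate, the same idea can probably be written as a dplyr pipeline. This is only a sketch under the same assumptions as above (one row per individual, event coded 0/1). The arrange() step and the name df_risk are my additions, not part of the original answer; arrange() is there because cumsum() depends on row order.

```r
library(tibble)
library(dplyr)

df <- tibble(
  event = c(1, 1, 1, 0, 0),
  time = c(10, 20, 30, 40, 50)
)

df_risk <- df %>%
  arrange(time) %>%                    # cumsum() only makes sense in time order
  mutate(nrisk = n() - cumsum(event))  # total rows minus events observed so far

df_risk
```

Note that this follows the definition used in the question: censored rows (event = 0) do not reduce the count, which is why nrisk stays at 2 for times 40 and 50.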