I have a dataset where each row is a monthly observation of patients, monitoring whether they tested positive to a disease (status).
I know which month (i.e. row number for each ID - TimeToDx) they were diagnosed, and what I would like to do is have a binary indicator that switches from 0 to 1, starting from the observation month indicated in TimeToDx.
Basically I need to replicate 0 by the TimeToDx - 1, then for the remainder of the rows replicate 1 for each ID.
Here is some example data - without the status indicator filled:
ID TimeToDx Status
10425 2
10425 2
10425 2
10425 2
10667 3
10667 3
10667 3
10667 3
10667 3
10686 2
10686 2
10686 2
10686 2
10686 2
17096 5
17096 5
17096 5
17096 5
17096 5
Here is what I would like to see:
ID TimeToDx Status
10425 2 0
10425 2 1
10425 2 1
10425 2 1
10667 3 0
10667 3 0
10667 3 1
10667 3 1
10667 3 1
10686 2 0
10686 2 1
10686 2 1
10686 2 1
10686 2 1
17096 5 0
17096 5 0
17096 5 0
17096 5 0
17096 5 1
Any help would be much appreciated.
df <- structure(list(ID = c(10425L, 10425L, 10425L, 10425L, 10667L,
10667L, 10667L, 10667L, 10667L, 10686L, 10686L, 10686L, 10686L,
10686L, 17096L, 17096L, 17096L, 17096L, 17096L),
TimeToDx = c(2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 2L, 2L,
2L, 2L, 2L, 5L, 5L, 5L, 5L, 5L)),
class = "data.frame", row.names = c(NA, -19L))
library(tidyverse)
df %>%
group_by(ID) %>%
mutate(Status = +((row_number() %/% TimeToDx) != 0)) %>%
ungroup()
#> # A tibble: 19 x 3
#> ID TimeToDx id_row
#> <int> <int> <int>
#> 1 10425 2 0
#> 2 10425 2 1
#> 3 10425 2 1
#> 4 10425 2 1
#> 5 10667 3 0
#> 6 10667 3 0
#> 7 10667 3 1
#> 8 10667 3 1
#> 9 10667 3 1
#> 10 10686 2 0
#> 11 10686 2 1
#> 12 10686 2 1
#> 13 10686 2 1
#> 14 10686 2 1
#> 15 17096 5 0
#> 16 17096 5 0
#> 17 17096 5 0
#> 18 17096 5 0
#> 19 17096 5 1
Created on 2021-10-21 by the reprex package (v2.0.1)
data.table
using @Jon Spring's solution
library(data.table)
setDT(df)[, Status := +(rowid(ID) >= TimeToDx)][]
#> ID TimeToDx Status
#> 1: 10425 2 0
#> 2: 10425 2 1
#> 3: 10425 2 1
#> 4: 10425 2 1
#> 5: 10667 3 0
#> 6: 10667 3 0
#> 7: 10667 3 1
#> 8: 10667 3 1
#> 9: 10667 3 1
#> 10: 10686 2 0
#> 11: 10686 2 1
#> 12: 10686 2 1
#> 13: 10686 2 1
#> 14: 10686 2 1
#> 15: 17096 5 0
#> 16: 17096 5 0
#> 17: 17096 5 0
#> 18: 17096 5 0
#> 19: 17096 5 1
Created on 2021-10-21 by the reprex package (v2.0.1)