Tags: r, dplyr, conditional-statements, calculated-columns, data-transform

Create a new Timepoint column based on a condition in R


I have a dataset with a subject ID and several timepoints for each subject. The data has already been filtered on a few conditions, and I want to create a new timepoint from the existing timepoints for each subject. The new timepoint should be the difference between the subject's first visit and the current visit. How can I do this in R?

Sample data with expected result

ID Original_timepoint New_timepoint
A 1 1
A 6 6
A 18 18
A 36 36
B 24 1
B 48 24

Solution

  • We can standardize the timepoints by taking diff() of 'Original_timepoint' within each 'ID' and prepending 1. In the input data, some IDs are already standardized (their first timepoint is 1), so an if/else condition skips those IDs.

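    library(dplyr)

    df1 %>%
        group_by(ID) %>%
        # IDs whose first timepoint is already 1 are kept as-is;
        # otherwise start at 1 and append the differences between visits
        mutate(New_timepoint = if (first(Original_timepoint) > 1)
              c(1, diff(Original_timepoint)) else Original_timepoint) %>%
        ungroup()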
    

    -output

    # A tibble: 6 x 3
      ID    Original_timepoint New_timepoint
      <chr>              <int>         <dbl>
    1 A                      1             1
    2 A                      6             6
    3 A                     18            18
    4 A                     36            36
    5 B                     24             1
    6 B                     48            24
    

    data

    df1 <- structure(list(ID = c("A", "A", "A", "A", "B", "B"), 
    Original_timepoint = c(1L, 
    6L, 18L, 36L, 24L, 48L), New_timepoint = c(1L, 6L, 18L, 36L, 
    1L, 24L)), class = "data.frame", row.names = c(NA, -6L))
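
One caveat worth sketching: diff() returns the gap between consecutive visits, which matches "difference from the first visit" only while an ID has at most two visits. The block below is a minimal sketch of a variant that measures every visit against the first one. It is not part of the original answer. The data frame df2 and the third visit for ID B (72) are made up for illustration, and labelling the first visit as 1 is an assumption taken from the expected result in the question.

```r
library(dplyr)

# Hypothetical data: ID "B" gets a third visit (72) that is not in the original sample
df2 <- data.frame(
  ID = c("A", "A", "A", "A", "B", "B", "B"),
  Original_timepoint = c(1L, 6L, 18L, 36L, 24L, 48L, 72L)
)

res <- df2 %>%
  group_by(ID) %>%
  mutate(
    # time elapsed since this subject's first visit
    elapsed = Original_timepoint - first(Original_timepoint),
    # IDs that already start at 1 are kept as-is; otherwise the first visit
    # is labelled 1 and later visits carry the elapsed time
    New_timepoint = if (first(Original_timepoint) > 1) {
      if_else(row_number() == 1L, 1L, elapsed)
    } else {
      Original_timepoint
    }
  ) %>%
  ungroup() %>%
  select(-elapsed)

res
```

With this data, ID B ends up with 1, 24, 48, whereas the diff() approach would give 1, 24, 24. On the original two-visit sample both approaches return the same values.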