Search code examples
rdplyrmutateacross

R dplyr mutate across columns - pick first value in several columns that is greater than another


I have a dataset that looks like this but with much more t_i columns:

df <- data.frame("id" = c(1,2,3,4), "date" = c(4,8,2,5), 
             "t_1" = c(NA,9,NA,4), "t_2" = c(NA,10,NA,5),
             "t_3" = c(NA,12,NA,6),"t_4" = c(NA,16,NA,8))

# id date t_1 t_2 t_3 t_4
# 1    4  NA  NA  NA  NA
# 2    8   9  10  12  NA
# 3    2  NA  NA  NA  NA
# 4    5   4   5   6   8

In my data, for each individuals "id", I have a date in the column "date" and I have other dates in columns "t_1,...,t_x" that are in chronological order. When one t_i is NA then all subsequent t_i+1,...,t_x are NA (non available). In the example data above, I chose to replace the dates by numeric values to keep it simple.

What I want: create a column to my data, let's say "entry" that takes the first value of the columns t_1, ...,t_x for which "date"<"t_i" else NA.

In my example, the result would be:

df <- data.frame("id" = c(1,2,3,4), "date" = c(4,8,2,5), 
             "entry"=c(NA,9,NA,6),"t_1" = c(NA,9,NA,4), 
             "t_2" = c(NA,10,NA,5),"t_3" = c(NA,12,NA,6),
             "t_4" = c(NA,NA,NA,8))

# id date entry t_1 t_2 t_3 t_4
# 1    4    NA  NA  NA  NA  NA
# 2    8     9   9  10  12  NA
# 3    2    NA  NA  NA  NA  NA
# 4    5     6   4   5   6   8
    

I have the feeling that I could use across or if_any or if_all in dplyr mutate but I can't find the right code! Your help would be more than welcome!

Thank you


Solution

  • You can first reshape your data into a "long" format, then use some indexing to get the correct "t_i" values for entry by id, then transform back to your "wide" format.

    library(tidyverse)
    
    df %>% 
      pivot_longer(c(-1, -2)) %>% 
      mutate(entry = min(value[date < value]), .by = id) %>% 
      pivot_wider()
    
    # A tibble: 4 × 7
         id  date entry   t_1   t_2   t_3   t_4
      <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
    1     1     4    NA    NA    NA    NA    NA
    2     2     8     9     9    10    12    16
    3     3     2    NA    NA    NA    NA    NA
    4     4     5     6     4     5     6     8