Mutate case_when nested conditional labelling

I have a data frame with multiple observations per ID. Each ID has performed many tests and for each test, has been classified into a tertile rank (top, mid, bottom) based on their test performance. This rank can vary at each time point.

Dummy df looks like this:

      df <- tibble(ID = c(1,1,1,2,2,2,3,3,3,4,4,4),
        time = c(1,2,3,1,2,3,1,2,3,1,2,3),
        test1_rank <- c("top", "top", "top", "top", "mid", "bottom", "bottom", "bottom", "bottom", "top", "bottom", "bottom"),
test2_rank <- c("bottom", "bottom", "bottom", "top", "mid", "bottom", "top", "top", "top", "top", "bottom", "bottom"))
    
     ID  time `test1_rank <- ...` `test2_rank <- ...`
   <dbl> <dbl> <chr>               <chr>              
 1     1     1 top                 bottom             
 2     1     2 top                 bottom             
 3     1     3 top                 bottom             
 4     2     1 top                 top                
 5     2     2 mid                 mid                
 6     2     3 bottom              bottom             
 7     3     1 bottom              top                
 8     3     2 bottom              top                
 9     3     3 bottom              top                
10     4     1 top                 top                
11     4     2 bottom              bottom             
12     4     3 bottom              bottom

I want to create a new variable where I classify each individual based on how their rank changes with time for each test. Specifically, for each test:

If rank is the same through all three time points, they get a label "stable" (can be stable(top) or stable(bottom) based on whether the rank is "top" or "bottom" across all time points).
If the rank changes from top (time 1) to mid (time 2) to bottom (time 3), they get a label of "gradual".
If rank declines from top (time 1) to bottom (time 2 and time 3), they get a label of "rapid".
Any other combination gets the label "Other".

Desired data frame:

  df2 <- tibble(ID = c(1,1,1,2,2,2,3,3,3,4,4,4),
                  time = c(1,2,3,1,2,3,1,2,3,1,2,3),
                  test1_rank <- c("top", "top", "top", "top", "mid", "bottom", "bottom", "bottom", "bottom", "top", "bottom", "bottom"),
test2_rank <- c("bottom", "bottom", "bottom", "top", "mid", "bottom", "top", "top", "top", "top", "bottom", "bottom"),
                  test1_rankgroup <- c("stable(top)", "stable(top)", "stable(top)", "gradual", "gradual", "gradual", "stable(bottom)", "stable(bottom)", "stable(bottom)", "rapid", "rapid", "rapid"), 
test2_rankgroup <- c("stable(bottom)", "stable(bottom)", "stable(bottom)", "gradual", "gradual", "gradual", "stable(top)", "stable(top)", "stable(top)", "rapid", "rapid", "rapid"))

        ID  time `test1_rank <- ...` `test2_rank <- ...` `test1_rankgroup <- ...` test2_ra…¹
   <dbl> <dbl> <chr>               <chr>               <chr>                    <chr>     
 1     1     1 top                 bottom              stable(top)              stable(bo…
 2     1     2 top                 bottom              stable(top)              stable(bo…
 3     1     3 top                 bottom              stable(top)              stable(bo…
 4     2     1 top                 top                 gradual                  gradual   
 5     2     2 mid                 mid                 gradual                  gradual   
 6     2     3 bottom              bottom              gradual                  gradual   
 7     3     1 bottom              top                 stable(bottom)           stable(to…
 8     3     2 bottom              top                 stable(bottom)           stable(to…
 9     3     3 bottom              top                 stable(bottom)           stable(to…
10     4     1 top                 top                 rapid                    rapid     
11     4     2 bottom              bottom              rapid                    rapid     
12     4     3 bottom              bottom              rapid                    rapid

What is the easiest way to solve this in dplyr using mutate and case_when?

Solution

You can use mutate with across to apply your case_when to both of your ranked columns.

Your case_when can use n_distinct to see if the same value is held across your 3 time points. You can also include specific logic for 1st, 2nd, and 3rd values for a labelled group value.

The .default argument can be used for "other". The .by argument will perform the mutate grouped by ID.

library(tidyverse)

df <- tibble(ID = c(1,1,1,2,2,2,3,3,3,4,4,4),
             time = c(1,2,3,1,2,3,1,2,3,1,2,3),
             test1_rank = c("top", "top", "top", "top", "mid", "bottom", "bottom", "bottom", "bottom", "top", "bottom", "bottom"),
             test2_rank = c("bottom", "bottom", "bottom", "top", "mid", "bottom", "top", "top", "top", "top", "bottom", "bottom"))

df %>%
  mutate(across(-time,
               ~ case_when(n_distinct(.) == 1 ~ paste("stable", .),
                           .[1] == "top" & .[2] == "mid" & .[3] == "bottom" ~ "gradual",
                           .[1] == "top" & .[2] == "bottom" & .[3] == "bottom" ~ "rapid",
                           .default = "other"), 
               .names = "{.col}_group"),
         .by = ID)

Output

      ID  time test1_rank test2_rank test1_rank_group test2_rank_group
   <dbl> <dbl> <chr>      <chr>      <chr>            <chr>           
 1     1     1 top        bottom     stable top       stable bottom   
 2     1     2 top        bottom     stable top       stable bottom   
 3     1     3 top        bottom     stable top       stable bottom   
 4     2     1 top        top        gradual          gradual         
 5     2     2 mid        mid        gradual          gradual         
 6     2     3 bottom     bottom     gradual          gradual         
 7     3     1 bottom     top        stable bottom    stable top      
 8     3     2 bottom     top        stable bottom    stable top      
 9     3     3 bottom     top        stable bottom    stable top      
10     4     1 top        top        rapid            rapid           
11     4     2 bottom     bottom     rapid            rapid           
12     4     3 bottom     bottom     rapid            rapid