Search code examples
rdplyrmutate

mutate (calculate) variables from one column based on specific row conditions in another col


Here is the example dataframe:

data.frame(sample = c('A','A','A','A','A','B','B','B','B','B'),
                 measure = c(20,30,40,60,60,20,60,50,40,10),
                 time = c(1,2,3,4,5,3,4,5,6,7),
                 start = c(1,1,1,1,1,3,3,3,3,3),
                 end = c(4,4,4,4,4,6,6,6,6,6))
   sample measure time start end
1       A      20    1     1   4
2       A      30    2     1   4
3       A      40    3     1   4
4       A      60    4     1   4
5       A      60    5     1   4
6       B      20    3     3   6
7       B      60    4     3   6
8       B      50    5     3   6
9       B      40    6     3   6
10      B      10    7     3   6

What I want to do is create a new column called "calc" that calculates the measure for time=end / time=start

I'm guessing group_by sample first but I'm having trouble figuring out how to reference the start and end times. Here's what the final data should look like

   sample measure time start end calc
1       A      20    1     1   4    3
2       A      30    2     1   4    3
3       A      40    3     1   4    3
4       A      60    4     1   4    3
5       A      60    5     1   4    3
6       B      20    3     3   6    2
7       B      60    4     3   6    2
8       B      50    5     3   6    2
9       B      40    6     3   6    2
10      B      10    7     3   6    2

Solution

  • what about (d being your dataframe):

    library(dplyr)
    
    d |>
      arrange(sample, time) |>
      rowwise() |>
      mutate(is_endpoint = any(time == start, time == end)) |>
      ungroup() |>
      mutate(calc = tail(measure[is_endpoint], 1) /
               head(measure[is_endpoint], 1),
             .by = sample
             ) |>
      select(-is_endpoint)
    

    output:

     # A tibble: 10 x 6
       sample measure  time start   end  calc
       <chr>    <dbl> <dbl> <dbl> <dbl> <dbl>
     1 A           20     1     1     4     3
     2 A           30     2     1     4     3
     3 A           40     3     1     4     3
     4 A           60     4     1     4     3
     5 A           60     5     1     4     3
     6 B           20     3     3     6     2
     7 B           60     4     3     6     2
     8 B           50     5     3     6     2
     9 B           40     6     3     6     2
    10 B           10     7     3     6     2