r, dataframe, subtraction

How to subtract the max value from all values of a group in R


I currently have a dataframe that looks like this:

tree  cookie  age
C1T1  A       113
C1T1  B       108
C1T1  C        97
C1T2  A       133
C1T2  B       110
C1T2  C       100

I would like to subtract each cookie's age from the max age of its tree (so 113 minus each of 113, 108, and 97 in C1T1, and 133 minus each of 133, 110, and 100 in C1T2). I would then like to save the new values as a column in the same dataframe, so that it looks something like:

tree  cookie  age  new_age
C1T1  A       113   0
C1T1  B       108   5
C1T1  C        97  16

Any advice on how to do this is appreciated!


Solution

  • We can use mutate to create the new column after grouping by 'tree', i.e. take the difference (-) between the max of 'age' and each 'age' value to create 'new_age'

    library(dplyr)
    df1 <-  df1 %>%
               group_by(tree) %>%
               mutate(new_age = max(age) - age)
    

    -output

    df1
    # A tibble: 6 x 4
    # Groups:   tree [2]
    #  tree  cookie   age new_age
    #  <chr> <chr>  <int>   <int>
    #1 C1T1  A        113       0
    #2 C1T1  B        108       5
    #3 C1T1  C         97      16
    #4 C1T2  A        133       0
    #5 C1T2  B        110      23
    #6 C1T2  C        100      33
    

    Or in base R with ave

    df1$new_age <- with(df1, ave(age, tree, FUN = max) - age)
    

    data

    df1 <- structure(list(tree = c("C1T1", "C1T1", "C1T1", "C1T2", "C1T2", 
    "C1T2"), cookie = c("A", "B", "C", "A", "B", "C"), age = c(113L, 
    108L, 97L, 133L, 110L, 100L)), class = "data.frame", row.names = c(NA, 
    -6L))
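
As a self-contained check of the base R approach, the sketch below rebuilds the example data, computes new_age with ave, and compares it against a second base R route that uses tapply plus a lookup by tree name. The names group_max and new_age_lookup are introduced here for illustration only and are not part of the original answer.

```r
# Rebuild the example data from the question (same values as df1 above).
df1 <- data.frame(
  tree   = c("C1T1", "C1T1", "C1T1", "C1T2", "C1T2", "C1T2"),
  cookie = c("A", "B", "C", "A", "B", "C"),
  age    = c(113L, 108L, 97L, 133L, 110L, 100L),
  stringsAsFactors = FALSE
)

# ave() returns the per-tree maximum repeated on every row of that tree,
# so subtracting age gives each cookie's distance from its tree's maximum.
df1$new_age <- with(df1, ave(age, tree, FUN = max) - age)

# Illustrative alternative: one maximum per tree via tapply(),
# then look up each row's tree by name before subtracting.
group_max <- tapply(df1$age, df1$tree, max)
new_age_lookup <- as.vector(group_max[df1$tree]) - df1$age

# Both routes should give the same column.
stopifnot(all(df1$new_age == new_age_lookup))

print(df1)
```

With the dplyr version, note that the result stays grouped by tree (the "# Groups: tree [2]" line in the output); adding ungroup() at the end of the pipe drops that grouping if later steps should operate on the whole dataframe.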