Applying different functions to different groups using 2 data frames


I was once decent at R years ago while working on my graduate research but have unfortunately lost my skills.

I want to apply two different functions depending on the species and store the result of each in a new column of a data frame.

Here is an example of my data:

df1 <- data.frame(Species= c("RGN","RGN","BRK","BRK"),
                  Length = c(191,193,184,167),
                  Weight = c(82,84,83,87))
df2 <- data.frame(Name = c("Cutthroat Trout","Brook Trout"),
                  Species = c("RGN","BRK"),
                  Slope = c(3.086,3.103),
                  Int = c(-5.192,-5.186))

I want to add a new column to df1 called Ws (df1$Ws), computed from values taken from df2 depending on the species. The function is as follows:

df1$Ws <- 10^(df2$Int + df2$Slope * log10(df1$Length))

The problem is that the function will change depending on the species. I think I need to use a combination of if statements and group_by, but I am stumped.


Solution

  • We assume that "two different functions" means the same functional form with different values of Int and Slope, taken from df2 according to Species.

    Using dplyr, join the two data frames on Species and then apply the transformation, so that each row uses the Int and Slope that match its species. Note that join_by() requires dplyr 1.1.0 or later.

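    library(dplyr)

    df1 %>% 
      # attach each species' Slope and Int from df2 to the rows of df1
      left_join(df2, join_by(Species)) %>%
      # compute Ws row by row with the joined coefficients
      mutate(Ws = 10^(Int + Slope * log10(Length))) %>%
      # keep the original columns plus the new one
      select(Species, Length, Weight, Ws)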
    

    giving

      Species Length Weight       Ws
    1     RGN    191     82 70.35079
    2     RGN    193     84 72.64904
    3     BRK    184     83 69.45919
    4     BRK    167     87 51.41493
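
    If you would rather not load dplyr, the same lookup can be sketched in base R with match(). This is only an alternative illustration under the same assumption as above (one functional form, coefficients varying by species); the variable name idx is introduced here for the example and is not part of the question. It also shows why the line in the question does not work as written: df2$Int and df2$Slope have one value per species, so they have to be lined up with the rows of df1 before the formula is applied.

```r
# Same example data as in the question.
df1 <- data.frame(Species = c("RGN", "RGN", "BRK", "BRK"),
                  Length = c(191, 193, 184, 167),
                  Weight = c(82, 84, 83, 87))
df2 <- data.frame(Name = c("Cutthroat Trout", "Brook Trout"),
                  Species = c("RGN", "BRK"),
                  Slope = c(3.086, 3.103),
                  Int = c(-5.192, -5.186))

# For each row of df1, find the position of its species in df2.
# idx is a hypothetical helper name used only in this sketch.
idx <- match(df1$Species, df2$Species)

# Index the coefficients by idx so they align row by row with df1.
df1$Ws <- 10^(df2$Int[idx] + df2$Slope[idx] * log10(df1$Length))
df1
```

    This should reproduce the Ws column shown above. A species in df1 that has no row in df2 gets an NA index from match(), and therefore an NA in Ws.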