Calculating Compounded Return by ID in R


I am trying to calculate a CAGR value, defined as (Ending/Beginning)^(1/number of years)-1.
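
As a minimal sketch of that formula, assuming a hypothetical helper named `cagr` (it is not part of any package) that takes the beginning value, the ending value, and the number of years:

```r
# Hypothetical helper: compound annual growth rate between two values.
cagr <- function(beginning, ending, years) {
  (ending / beginning)^(1 / years) - 1
}

# A value that grows from 100 to 121 over 2 years compounds at 10% per year.
cagr(100, 121, 2)
```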

I have a data frame with the columns "Stock", "date", and "Annual.Growth.Rate". A quick note: I tried to do this with the lag function, but I couldn't get the recursive formula to restart at the beginning of each stock. It will make more sense looking at the dput:

structure(list(Stock = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 
2L, 2L, 2L, 2L, 2L), .Label = c("A", "B"), class = "factor"), 
    date = structure(c(6L, 2L, 3L, 4L, 5L, 1L, 12L, 8L, 9L, 10L, 
    11L, 7L), .Label = c("3/28/16", "3/29/12", "3/29/13", "3/29/14", 
    "3/29/15", "3/30/11", "6/28/16", "6/29/12", "6/29/13", "6/29/14", 
    "6/29/15", "6/30/11"), class = "factor"), Annual.Growth.Rate = c(0.1, 
    0.2, 0.1, 0.1, 0.1, 0.1, 0.3, 0.2, 0.14, 0.14, 0.14, 0.14
    ), Growth = c(110, 132, 145.2, 159.72, 175.692, 193.2612, 
    130, 156, 177.84, 202.7376, 231.120864, 263.477785), CAGR = c(0.098479605, 
    0.098479605, 0.098479605, 0.098479605, 0.098479605, 0.098479605, 
    0.125, 0.125, 0.125, 0.125, 0.125, 0.125)), .Names = c("Stock", 
"date", "Annual.Growth.Rate", "Growth.on.100", "CAGR"), class = "data.frame", row.names = c(NA, 
-12L)) 

This is the expected output. The original data had only the Stock, date, and Annual.Growth.Rate columns. Growth.on.100 is not simply a "lag" of the previous row. For each stock's first available date, a given starter (100 in this case) is multiplied by one plus the growth rate, (1 + 0.1) * 100 = 110. Each following value is the previous value multiplied by one plus the next growth rate, e.g. 110 * (1 + 0.2) = 132. I can figure out how to do the CAGR using dplyr, but I'm really stuck on Growth.on.100.
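
To make the recursion concrete, here is a minimal sketch for stock A alone, written as an explicit loop. The variable names (`rates`, `previous`, `growth_loop`) are illustrative and do not come from the original data frame.

```r
# Annual growth rates for stock A, copied from the dput above.
rates <- c(0.1, 0.2, 0.1, 0.1, 0.1, 0.1)
starter <- 100

# Each value is the previous value times (1 + rate); the first uses the starter.
growth_loop <- numeric(length(rates))
previous <- starter
for (i in seq_along(rates)) {
  previous <- previous * (1 + rates[i])
  growth_loop[i] <- previous
}

growth_loop
```

This should reproduce the Growth.on.100 values listed for stock A in the dput. The open question is how to restart the recursion at the starter for every stock.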


Solution

  • You could use cumprod inside a mutate. The starting value of 100 is arbitrary: the whole calculation is a running product, so you can compute the cumulative product of (1 + growth) within each stock and then multiply the result by the starter.

    library(dplyr)

    starter <- 100
    # Example data: two stocks, three periods each
    my.data <- data.frame(stock = c('a','a','a','b','b','b'),
                          growth = c(.1,.2,.1,.1,.1,.1),
                          date = c(1,2,3,1,2,3))
    my.data
    new.data <- my.data %>%
      group_by(stock) %>%
      # Running product of (1 + growth) within each stock, in date order
      mutate(growth.unit = order_by(date, cumprod(1 + growth)),
             growth.on.100 = growth.unit * starter)
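
As a hedged sketch of how the same idea might apply to the data in the question, the block below rebuilds the three input columns by hand, parses the dates so that sorting is chronological, and computes both columns per stock. The CAGR convention used here (last value over first value, with the number of rows as the number of years) is an assumption inferred from the expected output, not something the question states explicitly.

```r
library(dplyr)

starter <- 100

# The question's input columns, rebuilt by hand (dates kept as m/d/y strings).
df <- data.frame(
  Stock = rep(c("A", "B"), each = 6),
  date = c("3/30/11", "3/29/12", "3/29/13", "3/29/14", "3/29/15", "3/28/16",
           "6/30/11", "6/29/12", "6/29/13", "6/29/14", "6/29/15", "6/28/16"),
  Annual.Growth.Rate = c(0.1, 0.2, 0.1, 0.1, 0.1, 0.1,
                         0.3, 0.2, 0.14, 0.14, 0.14, 0.14)
)

result <- df %>%
  # Parse the dates so ordering is chronological rather than alphabetical.
  mutate(date = as.Date(date, format = "%m/%d/%y")) %>%
  arrange(Stock, date) %>%
  group_by(Stock) %>%
  mutate(
    # Running product restarts for each stock because of group_by().
    Growth.on.100 = starter * cumprod(1 + Annual.Growth.Rate),
    # Assumed convention: last over first value, row count as number of years.
    CAGR = (last(Growth.on.100) / first(Growth.on.100))^(1 / n()) - 1
  ) %>%
  ungroup()

result
```

Sorting with arrange() before grouping replaces the order_by() call. Under the assumed convention, the CAGR for stock A should agree with the value in the question, and the one for stock B should come out at approximately 0.125.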