
Cumulative Returns with NA's in R


I have the following data frame:

df <- data.frame(Return1=c(NA, NA, .03, .04, .05),
             Return2=c(.25, .33, NA, .045, .90),
             Return3=c(.04, .073, .08, .04, .01))


  Return1 Return2 Return3
1      NA   0.250   0.040
2      NA   0.330   0.073
3    0.03      NA   0.080
4    0.04   0.045   0.040
5    0.05   0.900   0.010

I would like to compute the cumulative returns, but there are missing values in the dataframe. I used:

cumprod(df+1)-1

Getting as a result

  Return1 Return2   Return3
1      NA  0.2500 0.0400000
2      NA  0.6625 0.1159200
3      NA      NA 0.2051936
4      NA      NA 0.2534013
5      NA      NA 0.2659354

The problem is that once a column contains an NA, every row below it also comes out as NA. Is there a way to compute the cumulative returns so that an NA stays NA in its own row but does not affect the rows below it?

I would like to obtain as a result:

  Return1 Return2   Return3
1      NA  0.2500 0.0400000
2      NA  0.6625 0.1159200
3    0.03      NA 0.2051936
4 0.07120  0.7373 0.2534013
5 0.12476  2.3008 0.2659354

I know of a function in the PerformanceAnalytics package called Return.cumulative, but it only returns the cumulative return over each entire column, not the running values row by row.

Any ideas?


Solution

  • cumpfun <- function(x){
      # compound only the non-missing returns; NA positions stay NA
      x[!is.na(x)] <- cumprod(x[!is.na(x)]+1)-1
      x
    }
    # apply the helper to each column
    sapply(df,cumpfun)
    
    #      Return1   Return2   Return3
    # [1,]      NA 0.2500000 0.0400000
    # [2,]      NA 0.6625000 0.1159200
    # [3,] 0.03000        NA 0.2051936
    # [4,] 0.07120 0.7373125 0.2534013
    # [5,] 0.12476 2.3008937 0.2659354
    

    Note that sapply returns a matrix. If you need a data frame, you could use something like as.data.frame(lapply(df, cumpfun)) instead.
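
    As a minimal sketch of the data frame variant (the names ok and res are only illustrative and not part of the original answer), assigning the lapply result into res[] should keep the data.frame class along with the column and row names:

```r
# Rebuild the example data from the question.
df <- data.frame(Return1 = c(NA, NA, .03, .04, .05),
                 Return2 = c(.25, .33, NA, .045, .90),
                 Return3 = c(.04, .073, .08, .04, .01))

# Same idea as cumpfun above: compound only the non-missing returns,
# leaving NA in the positions where the input was NA.
cumpfun <- function(x) {
  ok <- !is.na(x)                    # 'ok' is an illustrative helper name
  x[ok] <- cumprod(x[ok] + 1) - 1
  x
}

# Assigning into res[] replaces the columns but keeps the data.frame structure.
res <- df
res[] <- lapply(df, cumpfun)
res
```

    This is just one way to do it; as.data.frame(lapply(df, cumpfun)) should give the same values.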