Search code examples
rmathmultiple-columnsshiftcontinuations

Continuous multiplication same column previous value


I have a problem. I have the following data frame.

1 2
NA 100
1.00499 NA
1.00813 NA
0.99203 NA

Two columns. In the second column, apart from the starting value, there are only NAs. I want to fill the first NA of the 2nd column by multiplying the 1st value from column 2 with the 2nd value from column 1 (100* 1.00499). The 3rd value of column 2 should be the product of the 2nd new created value in column 2 and the 3rd value in column 1 and so on. So that at the end the NAs are replaced by values.

These two sources have helped me understand how to refer to different rows. But in both cases a new column is created.I don't want that. I want to fill the already existing column 2.

Use a value from the previous row in an R data.table calculation

https://statisticsglobe.com/use-previous-row-of-data-table-in-r

Can anyone help me?

Thanks so much in advance.

Sample code

library(quantmod)
data.N225<-getSymbols("^N225",from="1965-01-01", to="2022-03-30", auto.assign=FALSE, src='yahoo') 
data.N225[c(1:3, nrow(data.N225)),]

data.N225<- na.omit(data.N225)
N225 <- data.N225[,6]


N225$DiskreteRendite= Delt(N225$N225.Adjusted)
N225[c(1:3,nrow(N225)),]

options(digits=5)
N225.diskret <- N225[,3]
N225.diskret[c(1:3,nrow(N225.diskret)),]

N225$diskretplus1 <- N225$DiskreteRendite+1
N225[c(1:3,nrow(N225)),]

library(dplyr)
N225$normiert <-"Value"
N225$normiert[1,] <-100
N225[c(1:3,nrow(N225)),]

N225.new <- N225[,4:5]
N225.new[c(1:3,nrow(N225.new)),]

Here is the code to create the data frame in R studio.

a <- c(NA, 1.0050,1.0081, 1.0095, 1.0016,0.9947)
b <- c(100, NA, NA, NA, NA, NA)
c<- data.frame(ONE = a, TWO=b)

Solution

  • You could use cumprod for cummulative product

    transform(
        df,
        TWO = cumprod(c(na.omit(TWO),na.omit(ONE)))
    )
    

    which yields

         ONE      TWO
    1     NA 100.0000
    2 1.0050 100.5000
    3 1.0081 101.3140
    4 1.0095 102.2765
    5 1.0016 102.4402
    6 0.9947 101.8972
    

    data

    > dput(df)
    structure(list(ONE = c(NA, 1.005, 1.0081, 1.0095, 1.0016, 0.9947
    ), TWO = c(100, NA, NA, NA, NA, NA)), class = "data.frame", row.names = c(NA,
    -6L))