Search code examples
rloopsdplyrlaglead

R How to use next row for same column calculation


I've been trying to replicate some basic Excel calculations in R but I can't find the right solution for this one.

I want to create a new column of a data.frame but the values being calculated should be based on the next values of the same column, so it is kind of being calculated from down-up.

I am able to do it using values of the same column (previous ones) but doesn't work for the next ones.

First part of my approach works fine (when using next values of another column), but not in the same column.

In excel it is simple but can't make it work.

Also, I don't know if a for loop is the best option, I have to do it with 1 million rows data.frame so it may be slow.

Thank you in advance.

p <- c(1,2,3,4,5,6,7,6,8,9,10)
g <- c(4,4,3,4,5,6,6,6,8,0,0)
data <- data.frame(p,g)
a <- 2

My approach

data$f2<- 0
for (i in 1:nrow(data)){
  data$f2[i] = data$g[i]*a+data$g[i+1]*a+data$g[i+2]*a+data$f2[i+1]*a+data$f2[i+2]*a
}


#Output
   g f2

1  4 22
2  4 22
3  3 24
4  4 30
5  5 34
6  6 36
7  6 40
8  6 28
9  8 16
10 0 NA
11 0 NA
> 

#Desired Correct Output

    g    f2
   
1     4 11232
2     4 11232
3     3 11232
4     4  4106
5     5  1498
6     6   540
7     6   192
8     6    60
9     8    16

Solution

  • base R solution

    Code:

    data$f2 <- 0
    
    for ( i in rev(seq_len(nrow(data)-2))) {
      data$f2[i] <-  with(data, sum( na.omit( c( g[c(i + 0:2)], f2[c(i + 1:2)])) * a                               
                                   ) 
                         )
    }
    

    Output

     data
     #    p g    f2
     #1:  1 4 83882
     #2:  2 4 30698
     #3:  3 3 11232
     #4:  4 4  4106
     #5:  5 5  1498
     #6:  6 6   540
     #7:  7 6   192
     #8:  6 6    60
     #9:  8 8    16
     #10:  9 0     0
     #11: 10 0     0