Search code examples
rdatatablezoo

rolling function using previous row values [R]


I have some sites (a,b,c) where the soil has a certain maximum capacity (100,200). For each period of time (every row in my dataset), I have calculated the change in water content (change).

set.seed(2023)
db <- data.table(ID=c(rep("a",3),rep("b",3),rep("c",3)),
                 capacity=c(rep(100,3),rep(200,6)),
                 prc=round(runif(9,1,100),0),
                 evpt=round(runif(9,1,100),0))

db[prc>=evpt,change:=prc-evpt]
db[prc<evpt,change:=exp(-((evpt-prc)/capacity))]


   ID capacity prc evpt     change
1:  a      100  39   20 19.0000000
2:  a      100  74   35 39.0000000
3:  a      100  67   19 48.0000000
4:  b      200  49   78  0.8650223
5:  b      200  41   70  0.8650223
6:  b      200  94   75 19.0000000
7:  c      200  74   51 23.0000000
8:  c      200  67   71  0.9801987
9:  c      200  77   73  4.0000000

At time 0 the site hold the maximum capacity of water. Then, if prc >evpt, change is the water added to the previous content, without exceeding the maximum capacity. If prc < evpt, change is a multiplier of the previous water content, thus reducing the content.

# draft function
new_content <- function(change,capacity,prc,evpt){
  if (prc>evpt) {
    min(change+previous_content,capacity)
    } else {
      previous_content*change
    }
}

Then I should apply the function for each site to get a new column (content) showing the temporal variation of water content. As in the following output calculated in Excel:

> db
   ID capacity prc evpt     change  content
1:  a      100  39   20 19.0000000 100.0000
2:  a      100  74   35 39.0000000 100.0000
3:  a      100  67   19 48.0000000 100.0000
4:  b      200  49   78  0.8650223 173.0045
5:  b      200  41   70  0.8650223 149.6527
6:  b      200  94   75 19.0000000 168.6527
7:  c      200  74   51 23.0000000 200.0000
8:  c      200  67   71  0.9801987 196.0397
9:  c      200  77   73  4.0000000 200.0000

The problem I cannot solve is how to define the variable previous_content, i.e. at the beginning equal to the maximum capacity and then to the content of the previous row.

I checked the documentation of both data.table::frollapply and zoo:rollapply but I was not able even to write a starting draft for that. Any help? Thanks!


Solution

  • Assuming the input shown in the Note at the end content2 generated using Reduce is the same as content:

    db[, content2 := Reduce(function(x, i) with(.SD[i, ], {
        if (prc > evpt) min(change+x, capacity)
        else x * change
      }), 1:.N, init = capacity[1], acc = TRUE)[-1], by = ID]
    
    db
    ##    ID capacity prc evpt     change  content content2
    ## 1:  a      100  39   20 19.0000000 100.0000 100.0000
    ## 2:  a      100  74   35 39.0000000 100.0000 100.0000
    ## 3:  a      100  67   19 48.0000000 100.0000 100.0000
    ## 4:  b      200  49   78  0.8650223 173.0045 173.0045
    ## 5:  b      200  41   70  0.8650223 149.6527 149.6527
    ## 6:  b      200  94   75 19.0000000 168.6527 168.6527
    ## 7:  c      200  74   51 23.0000000 200.0000 200.0000
    ## 8:  c      200  67   71  0.9801987 196.0397 196.0397
    ## 9:  c      200  77   73  4.0000000 200.0000 200.0000
    

    In the example in the question capacity is always constant within ID and change is multiplied if it is <= 1 and added otherwise. If those are true generally we can simplify the above:

    db[, content3 := Reduce(function(x, change) 
          if (change > 1) min(change + x, capacity[1]) else x * change, 
          change, init = capacity[1], acc = TRUE)[-1], 
        by = ID]
    

    Note

    library(data.table)
    
    Lines <-  "ID capacity prc evpt     change  content
    a      100  39   20 19.0000000 100.0000
    a      100  74   35 39.0000000 100.0000
    a      100  67   19 48.0000000 100.0000
    b      200  49   78  0.8650223 173.0045
    b      200  41   70  0.8650223 149.6527
    b      200  94   75 19.0000000 168.6527
    c      200  74   51 23.0000000 200.0000
    c      200  67   71  0.9801987 196.0397
    c      200  77   73  4.0000000 200.0000"
    db <- fread(Lines)