
Lag function for time series


I have a question about a lag-style operation that I have been unable to solve in R. My data frame has a variable 'V3', which is a time series from a very large data file. 'resultV4' is what I want to produce (see the code snippet).

  1. I want my resultV4 to be -1 in the first row, based on V3.
  2. In the 2nd row of V3 there is a zero, so I want the value of resultV4 to be -1 (the value of the 1st row)
  3. In the 3rd row of V3 there is a zero, so I want the value of resultV4 to be -1 (the value of the 1st row)

When the value of V3 changes, in this case to 1 in the 6th row, I want resultV4 to be 1. The 7th row of V3 is another 0, so resultV4 should take the value from the 6th row of V3, which is 1. And so on.

V3<-c(-1,0,0,0,0,1,0,0,0,0,-1,0,0,0,0,-1,0,0)
resultV4<-c(-1,-1,-1,-1,-1,1,1,1,1,1,-1,-1,-1,-1,-1,-1,-1,-1)
df<-cbind(V3,resultV4)

Thanks in advance for any suggestions.

Cheers,

PCdL


Solution

  • You can use na.locf from package zoo.

    library(zoo)
    
    V3 <- c(-1,0,0,0,0,1,0,0,0,0,-1,0,0,0,0,-1,0,0)
    V3_adj <- V3
    

    Replace 0 with NA:

    my_zero <- which(V3 == 0)
    
    V3_adj[my_zero] <- NA
    

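    Carry forward the last observation. Note that na.locf drops leading NAs by default, so if V3 could start with 0, pass na.rm = FALSE to keep the result the same length as V3: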

    resultV4 <- na.locf(V3_adj)
    
    cbind(V3, V3_adj, resultV4)
    

    Result:

         V3 V3_adj resultV4
     [1,] -1     -1       -1
     [2,]  0     NA       -1
     [3,]  0     NA       -1
     [4,]  0     NA       -1
     [5,]  0     NA       -1
     [6,]  1      1        1
     [7,]  0     NA        1
     [8,]  0     NA        1
     [9,]  0     NA        1
    [10,]  0     NA        1
    [11,] -1     -1       -1
    [12,]  0     NA       -1
    [13,]  0     NA       -1
    [14,]  0     NA       -1
    [15,]  0     NA       -1
    [16,] -1     -1       -1
    [17,]  0     NA       -1
    [18,]  0     NA       -1
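
If adding a dependency on zoo is not an option, the same carry-forward idea can be sketched in base R. This is only an illustrative alternative, not part of the answer above: the function name `carry_forward_nonzero` is made up for this example, and it assumes that 0 is the only "no change" marker in V3.

```r
# Hypothetical helper (the name is an assumption for this sketch):
# carry the last nonzero value forward over runs of zeros, using base R only.
carry_forward_nonzero <- function(x) {
  nonzero <- x != 0
  # cumsum() counts the nonzero values seen so far, so each position gets
  # the index of the most recent nonzero entry (0 means none seen yet).
  group <- cumsum(nonzero)
  # Prepend NA so that leading zeros (group 0) map to NA instead of being dropped.
  c(NA, x[nonzero])[group + 1]
}

V3 <- c(-1, 0, 0, 0, 0, 1, 0, 0, 0, 0, -1, 0, 0, 0, 0, -1, 0, 0)
resultV4 <- carry_forward_nonzero(V3)
cbind(V3, resultV4)
```

Unlike the default na.locf call, this sketch always returns a vector of the same length as its input; positions before the first nonzero value come back as NA.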