Search code examples
rformulasarray-formulas

R find max(or min) of dynamic window of time series


I've a time series with two columns, one column contains a "signal" either NA or not (then it's an integer between 1 and 5, I don't care about the actual value unless it differs from NA) and the second column contains the actual value.

I need to calculate the highest value since the signal was not NA. This is shown in the example below.

    Date        Sig Val     Expected result
    2008-01-01  1   47      47  <<- Sig==1, i. e. here we start counting
    2008-01-02  NA  31      47
    2008-01-03  NA  61      61  <<- 61 is higher than 47, so this one is important now
    2008-01-04  NA  43      61  
    2008-01-05  NA  23      61
    2008-01-06  NA  46      61
    2008-01-07  NA  17      61
    2008-01-08  NA  52      61
    2008-01-09  NA  84      84  <<- a new high, value should be kept
    2008-01-10  NA  54      84
    2008-01-11  1   30      30  <<- a new signal, here we start counting again
    2008-01-12  NA  36      36  <<- a new higher value in this segment
    2008-01-13  NA  59      59  <<- again a new higher value in this segment
    2008-01-14  NA  56      59
    2008-01-15  NA  15      59
    2008-01-16  NA  21      59
    2008-01-17  NA  87      87
    2008-01-18  NA  81      87
    2008-01-19  2   94      94  <<- a new signal this time a 2, so here we start counting again
    2008-01-20  NA  42      94
    2008-01-21  NA  95      95
    2008-01-22  1   42      42  <<- a new signal, here we start counting again
    2008-01-23  NA  25      42
    2008-01-24  NA  20      42
    2008-01-25  NA  76      76
    2008-01-26  NA  95      95
    2008-01-27  NA  14      95
    2008-01-28  NA  12      95
    2008-01-29  NA  13      95
    2008-01-30  NA  57      95
    2008-01-31  NA  26      95

The problem is that the relevant observation window is dynamic, depending in signal-column.

I've played around with the runMax(), cummax() and rollapply() functions but they only work on a specific window length ... I guess I can't see the wood for the trees, but I can't figure out how to make the "look-back" window dynamic. Any guess?


Solution

  • Create a column that creates the signal groups, then use cummax

    Using data.table (assuming your data is in a data.frame df)

    library(data.table)
    
    D <- as.data.table(df)
    D[, list(maxvalue = cummax(value)), 
        by = list(sig2 = cumsum(replace(sig, is.na(sig), 0)))]