Tags: r, search, boolean, boolean-logic, boolean-operations

Is there a series of `n` elements that satisfy a condition, wrapped between two series of `m` elements that satisfy another condition, in `x`?


This question comes as a follow-up to these excellent answers.

Using the answer linked above, one can determine whether a numeric vector x contains a series of at least n elements that satisfy a condition (being greater than 50, for example), where that series is wrapped between at least one series on each side of at least m elements that do not satisfy the same condition (see the linked post for more information). My goal is to generalize this function to allow a different condition for the series of n elements than for the series of m elements. Below I use one of the two answers to the linked post as an example, but it might be easier to modify the function from the other answer to make the generalization.

### Function ###

runfun = function(TFvec, list_n, cond=`>=`) {
  ## setup
  n = length(list_n)
  r = rle(TFvec); l = r$lengths

  ## initial condition
  idx = which(cond(l, list_n[1]) & r$values)
  idx = idx[idx > n - 1 & idx + n - 1 <= length(l)]

  ## adjacent conditions
  for (i in seq_len(n - 1)) {
      if (length(idx) == 0)
          break     # no solution
      thresh = list_n[i + 1]
      test = cond(l[idx + i], thresh) & cond(l[idx - i], thresh)
      idx = idx[test]
  }

  ## starts = cumsum(l)[idx - 1] + 1
  ## any luck?
  length(idx) != 0
  }

### Examples ###

x = c(20, 11, 52, 53, 10, 2, 3, 51, 34, 54, 29)
n = 2
m = 3
runfun(TFvec = x>50, list_n = list(n,m)) # FALSE

x = c(20, 11, 44, 52, 53, 10, 2, 3, 51, 34, 54, 29)
n = 2
m = 3
runfun(TFvec = x>50, list_n = list(n,m)) # TRUE
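
To see why the two calls differ, it can help to look at the run-length encoding that runfun works on. The sketch below uses only base R's rle(); the names x1, x2, r1 and r2 are introduced here purely for illustration.

```r
# Run-length encoding of the condition vector for both example inputs
x1 <- c(20, 11, 52, 53, 10, 2, 3, 51, 34, 54, 29)
r1 <- rle(x1 > 50)
r1$lengths  # 2 2 3 1 1 1 1
r1$values   # FALSE TRUE FALSE TRUE FALSE TRUE FALSE

x2 <- c(20, 11, 44, 52, 53, 10, 2, 3, 51, 34, 54, 29)
r2 <- rle(x2 > 50)
r2$lengths  # 3 2 3 1 1 1 1
```

In the first vector the run of two TRUE values is preceded by only two FALSE values, so m = 3 fails on the left side. The extra 44 in the second vector lengthens that leading run to three, which is why the second call succeeds.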

I am now trying to push this function a bit further: it should find a series of at least n elements that satisfy one condition, wrapped between series of at least m elements (at least one on each side) that satisfy another condition. Something like:

runfun2(TFvec = list(x > 50, x < 40), list_n = list(n,m))

would return TRUE if there is at least one series of at least n elements that are larger than 50 in x, and if this series is wrapped between two series (one on each side) of at least m elements that are smaller than 40 in x.

TFvec is now a list of the same length as list_n. In the special case where the elements of TFvec are identical, runfun2 does the same thing as runfun. For simplicity, one can assume that an element of x can never satisfy more than one of the conditions.
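
One consequence of using two separate conditions is that an element may satisfy neither of them, which cannot happen when the flanking series is simply "not the first condition". A small check, assuming the thresholds from the example call above (the names mid and side are made up for this illustration):

```r
x <- c(20, 11, 44, 52, 53, 10, 2, 3, 51, 34, 54, 29)
mid  <- x > 50   # condition for the series of n elements
side <- x < 40   # condition for the series of m elements

any(mid & side)       # FALSE: the two conditions never overlap
which(!mid & !side)   # 3: the value 44 satisfies neither condition
```

Any generalization therefore has to distinguish three states per element rather than two.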


Solution

  • Like this, perhaps:

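    f<-function(mcond,ncond,m,n){
      # Encode each position: 1 = mcond (middle run), 2 = ncond (flanking runs), 0 = neither
      q<-rep(0,length(mcond))
      q[ncond]<-2
      q[mcond]<-1

      r<-rle(q)
      # Runs of 1s whose immediate neighbours on both sides are runs of 2s
      possible<-which(r$values==1
                 & c(r$values[-1],0)==2
                 & c(0,head(r$values,-1))==2
                 )
      # Keep middle runs of length >= m whose neighbours both have length >= n
      possible<-possible[r$lengths[possible]>=m &
                         r$lengths[possible+1]>=n &
                         r$lengths[possible-1]>=n]
      # Start index and length of each qualifying middle run
      list(start=1+cumsum(r$lengths)[possible-1],length=r$lengths[possible])
    }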
    

    Example (runs of at least 3 values above 50, flanked on each side by at least 2 values below 40):

    > set.seed(123)                                        
    > x<-sample(100,300,T)
    > f(x>50,x<40,3,2)
    $start
    [1]  20 294
    
    $length
    [1] 9 4
    
    > x[18:30]
     [1]   5  33  96  89  70  65 100  66  71  55  60  29  15
    > x[292:299]
    [1] 11  8 89 76 82 99 11 10
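
If only a TRUE/FALSE result in the shape of the question's runfun2 call is needed, the same run-length idea can be wrapped as follows. This is a minimal sketch rather than the answerer's code: it assumes exactly two conditions (the middle one first, the flanking one second) and that the conditions are mutually exclusive and free of NA values, as the question allows. The test vectors x_yes and x_no are made up here.

```r
# Hypothetical wrapper following the question's argument order:
# TFvec[[1]] / list_n[[1]] describe the middle run,
# TFvec[[2]] / list_n[[2]] describe the runs on each side.
runfun2 <- function(TFvec, list_n) {
  mid <- TFvec[[1]]; side <- TFvec[[2]]
  n <- list_n[[1]];  m <- list_n[[2]]

  # Three-state encoding: 1 = middle condition, 2 = side condition, 0 = neither
  code <- integer(length(mid))
  code[side] <- 2L
  code[mid] <- 1L

  r <- rle(code)
  k <- length(r$values)
  if (k < 3) return(FALSE)  # a wrapped run needs a neighbour on each side

  i <- 2:(k - 1)  # candidate middle runs (never the first or last run)
  any(r$values[i] == 1L     & r$lengths[i] >= n &
      r$values[i - 1] == 2L & r$lengths[i - 1] >= m &
      r$values[i + 1] == 2L & r$lengths[i + 1] >= m)
}

x_yes <- c(20, 11, 30, 52, 53, 10, 2, 3, 51, 34, 54, 29)
x_no  <- c(20, 11, 44, 52, 53, 10, 2, 3, 51, 34, 54, 29)
runfun2(list(x_yes > 50, x_yes < 40), list(2, 3))  # TRUE
runfun2(list(x_no > 50, x_no < 40), list(2, 3))    # FALSE: 44 satisfies neither condition
```

Unlike f above, this wrapper takes the middle condition and its minimum length first, matching the question's list(n, m) ordering, and it discards the start positions.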