Assume an ordered set of 100 binary values. Using a window size of 10, I would like to know the ranges (i.e., start and end position) of those windows that contain at least x "1s" (where x=3, for example).
> set.seed(123456789)
> full=rep(0,100)
> full[sample(1:100, 15)]=1
> split(full, ceiling(seq_along(full)/10))
$`1`
[1] 0 0 0 0 0 1 0 0 0 0
$`2`
[1] 0 0 1 0 0 0 0 0 0 0
$`3`
[1] 0 0 1 0 1 0 0 0 0 0
$`4`
[1] 0 0 0 0 0 0 0 1 0 0
$`5`
[1] 0 1 0 0 0 0 0 0 1 0
$`6`
[1] 0 0 0 0 0 0 0 0 0 0
$`7`
[1] 0 0 0 0 1 0 1 0 0 1
$`8`
[1] 0 0 0 0 0 1 0 0 0 0
$`9`
[1] 0 0 0 0 0 1 1 0 1 0
$`10`
[1] 0 0 0 0 0 0 0 0 0 1
Here's what I am looking for:
> desired_function(full)
61-70
81-90
One option is a rolling apply (rollapply(), or rollsum()) with width 10: compute the number of 1s in each window (the data are binary, so this is just the sum), check which windows reach the threshold of three, get the positions of the TRUE values with which(), convert those positions to buckets with cut(), and keep the unique() bucket values.
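library(zoo)
# Rolling sum over each width-10 window; element i covers positions i..(i + 9)
hits <- which(rollapply(full, 10, sum) >= 3)
# Bucket the window start positions into blocks of 10 and keep the distinct buckets
starts <- seq(1, 91, by = 10)
unique(cut(hits, breaks = c(starts - 1, 100),
           labels = paste(starts, starts + 9, sep = "-")))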
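If the windows are meant to be the fixed, non-overlapping blocks shown in the question's split() output (1-10, 11-20, and so on) rather than every sliding window, a base R sketch may be enough. The function name windows_with_min_ones and its arguments width and x are hypothetical names chosen for illustration; they are not part of zoo or base R.

```r
# Hypothetical helper (illustrative only): report the fixed, non-overlapping
# windows of `width` elements that contain at least `x` ones.
windows_with_min_ones <- function(v, width = 10, x = 3) {
  # Assign each position to a block: 1..width -> 1, (width + 1)..(2 * width) -> 2, ...
  block <- ceiling(seq_along(v) / width)
  # Number of 1s in each block (names of the result are the block numbers)
  counts <- tapply(v, block, sum)
  # Block numbers that meet the threshold
  hit <- as.integer(names(counts)[counts >= x])
  # Convert block numbers back to start and end positions;
  # the last block is clipped to the length of the input
  start <- (hit - 1L) * width + 1L
  end <- pmin(hit * width, length(v))
  paste(start, end, sep = "-")
}

# Small deterministic example: three blocks of width 5
v <- c(1, 1, 1, 0, 0,  0, 0, 0, 0, 1,  1, 0, 1, 1, 0)
windows_with_min_ones(v, width = 5, x = 3)
# [1] "1-5"   "11-15"
```

For the ten blocks printed in the question, only blocks 7 and 9 hold three 1s, so calling windows_with_min_ones(full) on that vector should give 61-70 and 81-90. When no block qualifies, the function returns an empty character vector.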