
R: Split a vector into overlapping subvectors of equal length


Suppose I have the vector 1 to 10 and wish to split it into subvectors that satisfy the following two conditions:

  1. an equal length of 3.

  2. with a shift of 1 between consecutive subvectors (that is, neighbouring subvectors share 2 elements).

I found an almost complete answer in "Split vector with overlapping samples in R", with a function that I modified below:

splitWithOverlap <- function(vec, seg.length, overlap) {
  starts = seq(1, length(vec), by=seg.length-overlap)
  ends   = starts + seg.length - 1
  ends[ends > length(vec)] = length(vec)

  lapply(1:length(starts), function(i) vec[starts[i]:ends[i]])
}
splitWithOverlap(1:10, 3, 2)

which produced

#[[1]]
#[1] 1 2 3

#[[2]]
#[1] 2 3 4

#[[3]]
#[1] 3 4 5

#[[4]]
#[1] 4 5 6

#[[5]]
#[1] 5 6 7

#[[6]]
#[1] 6 7 8

#[[7]]
#[1] 7 8 9

#[[8]]
#[1]  8  9 10

#[[9]]
#[1]  9 10

#[[10]]
#[1] 10    

what I want is

#[[1]]
#[1] 1 2 3

#[[2]]
#[1] 2 3 4

#[[3]]
#[1] 3 4 5

#[[4]]
#[1] 4 5 6

#[[5]]
#[1] 5 6 7

#[[6]]
#[1] 6 7 8

#[[7]]
#[1] 7 8 9

#[[8]]
#[1]  8  9 10

because both conditions are then satisfied and, mathematically, the number of blocks is vector length - subvector length + 1 = 10 - 3 + 1 = 8, not 10.

I want a modification to the function so that it will stop at subvector 8.


Solution

  • DATA

    len = 3
    ov = 1
    vec = 1:10
    

    1

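    # One window per start index i; there are length(vec) - (len - ov) = 8 starts here.
    # i:(i + len - ov) covers len - ov + 1 elements, i.e. 3 for the data above.
    lapply(1:(length(vec) - (len - ov)), function(i){
        vec[i:(i + len - ov)]
    })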
    

    2

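    # ind: within-window offsets 1..len, repeated once per window (8 windows here)
    ind = rep(1:len, length(vec) - (len - ov))
    # ave() yields the window number (1,1,1,2,2,2,...); each matrix row is one subvector
    matrix(vec[ind + ave(ind, ind, FUN = seq_along) - 1], ncol = len, byrow = TRUE)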
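
Since the question asks for a modification of the original function, here is a minimal sketch of one way to do that. It keeps the question's argument names (`seg.length` as the subvector length, `overlap` as the number of shared elements) and simply stops generating start positions once a full-length window no longer fits. The name `splitWithOverlapFull` and the input checks are assumptions added for illustration; they are not part of the question or the answer above.

```r
# Hypothetical variant of the question's splitWithOverlap(): the function name
# and the stopifnot() checks are assumptions made for this sketch.
splitWithOverlapFull <- function(vec, seg.length, overlap) {
  stopifnot(seg.length >= 1, overlap >= 0, overlap < seg.length)
  n <- length(vec)
  if (n < seg.length) return(list())  # no full-length window fits
  # The last admissible start is n - seg.length + 1, so every window is complete.
  starts <- seq(1, n - seg.length + 1, by = seg.length - overlap)
  lapply(starts, function(s) vec[s:(s + seg.length - 1)])
}

res <- splitWithOverlapFull(1:10, 3, 2)
length(res)   # 8, i.e. 10 - 3 + 1
res[[8]]      # 8 9 10
```

With `overlap = 2` the step is 1, which reproduces the eight subvectors wanted in the question. For a larger step, trailing elements that cannot fill a whole window are dropped: `splitWithOverlapFull(1:10, 3, 1)` returns four subvectors ending at 9.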