I'm looking for a function in R that, given an integer, splits a word into all substrings of that length using a rolling window.
For example, function("stackoverflow", 4)
would render:
c("stac", "tack", "acko", "ckov", "kove", "over", "verf", "erfl", "rflo", "flow")
Do you guys know if that function exists or must I create it?
## install.packages("zoo")
x <- unlist(strsplit("stackoverflow",""))
zoo::rollapply(x,width=4,FUN = paste0,collapse="")
# [1] "stac" "tack" "acko" "ckov" "kove" "over" "verf" "erfl" "rflo" "flow"
Wrapped in a function:
foo <- function(input, h) {
  x <- unlist(strsplit(input, ""))
  zoo::rollapply(x, width = h, FUN = paste0, collapse = "")
}
foo("stackoverflow", 4)
# [1] "stac" "tack" "acko" "ckov" "kove" "over" "verf" "erfl" "rflo" "flow"
A benchmark
Consider the base R approach with substring():
foo1 <- function(input, h) substring(input, seq_len(nchar(input)-h+1),h:nchar(input))
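The one-liner above is only meaningful when 1 <= h <= nchar(input). As a minimal sketch, a guarded variant could look like the following; the name foo1_safe and the choice to return an empty vector when no window fits are my own assumptions, not part of the original approach.

```r
# Hypothetical guarded variant of foo1. The name and the empty-result
# convention are assumptions made for illustration.
foo1_safe <- function(input, h) {
  n <- nchar(input)
  # No window of width h fits: return an empty character vector.
  if (h < 1 || h > n) return(character(0))
  starts <- seq_len(n - h + 1)   # first index of each window
  substring(input, starts, starts + h - 1)
}

foo1_safe("stackoverflow", 4)
foo1_safe("abc", 5)
# character(0)
```

For valid widths it should return the same windows as foo1.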
Let's generate a very long toy character string:
x <- paste0(rep("a",100000), collapse="")
system.time(foo(x,4))
# user system elapsed
# 2.280 0.004 2.288
system.time(foo1(x,4))
# user system elapsed
# 10.492 0.000 10.509
So, on this input, the seemingly vectorized function substring() is roughly 4.6 times slower than the rollapply() version (about 10.5 s versus 2.3 s elapsed), which is an interesting observation!
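If neither option is fast enough, another base R idea is to split the string into characters once and paste together h shifted copies of that vector. This is only a sketch: the name foo2 is hypothetical, and I have not timed it against the two functions above, so treat any speed benefit as an assumption to verify with system.time() on your own data.

```r
# Hypothetical alternative: build all windows from shifted character vectors.
foo2 <- function(input, h) {
  chars <- strsplit(input, "")[[1]]   # split into single characters once
  n <- length(chars)
  # No window of width h fits: return an empty character vector (assumed convention).
  if (h < 1 || h > n) return(character(0))
  # Element i of 'pieces' holds the i-th character of every window.
  pieces <- lapply(seq_len(h), function(i) chars[i:(n - h + i)])
  # paste0 concatenates the pieces element-wise, giving one string per window.
  do.call(paste0, pieces)
}

foo2("stackoverflow", 4)
# [1] "stac" "tack" "acko" "ckov" "kove" "over" "verf" "erfl" "rflo" "flow"
```

The design choice here is to do h vectorized paste steps over the whole string instead of one substring extraction per window.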