I have a very long string (~1000 words) and I would like to split it into overlapping two-word phrases.
I have this:
string <- "A B C D E F"
and I would like this:
"A B"
"B C"
"C D"
"D E"
"E F"
The long string has already been cleaned and stemmed, and stop-words have been removed.
I tried to use str_split, but (I think) it needs a separator, which is complicated here: I don't want to separate "A" from "B", only "A B" from "C D", "B C" from "D E", and so on.
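# Split the string into individual words
tmp <- strsplit(string, " ")[[1]]
tmp
# [1] "A" "B" "C" "D" "E" "F"
# From the second word on, paste the previous word and the current one
sapply(seq_along(tmp)[-1], function(z) paste(tmp[(z - 1):z], collapse = " "))
# [1] "A B" "B C" "C D" "D E" "E F"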
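A possible alternative that avoids the explicit loop: since paste() is vectorized, you could paste the word vector without its last element against the same vector without its first element. This is only a sketch; the names `words` and `bigrams` are illustrative, and it assumes the words are separated by single spaces, as in the example above.

```r
# Example input, same as in the question
string <- "A B C D E F"

# Split on single spaces into individual words (assumes one space between words)
words <- strsplit(string, " ", fixed = TRUE)[[1]]

# Pair each word with its successor: head(words, -1) drops the last word,
# tail(words, -1) drops the first, and paste() combines them element-wise
bigrams <- paste(head(words, -1), tail(words, -1))
bigrams
# [1] "A B" "B C" "C D" "D E" "E F"
```

For a string of around 1000 words either approach should be fast enough; the vectorized form is mainly shorter to read.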