I have a lot of text in which words are split across line breaks with a dash, like so:
vec <- "Today is a good day because the sun is shin- ing."
What I want instead is:
"Today is a good day because the sun is shining."
I don't want this just for specific words, but for every word that is "broken up" like that. It seems like something you should be able to do in Word, but I haven't been able to figure out how, so maybe it's more complicated.
For the record, I am using the readtext/quanteda packages, but I can't find anything there that does this, at least not by default.
Is there some simple way to do this?
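Here is one way. We can use str_replace_all from the stringr package. The pattern "-\\s+" matches a dash followed by one or more whitespace characters, and each match is replaced with an empty string.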
vec <- "Today is a good day because the sun is shin- ing."
library(stringr)
vec2 <- str_replace_all(vec, "-\\s+", "")
vec2
# [1] "Today is a good day because the sun is shining."
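Note that "-\\s+" removes every dash followed by whitespace, including ones that are not line-break hyphens (for example in "3 - 5"). A slightly stricter variant, sketched below in base R, only joins when the dash sits directly between two letters. The function name join_hyphenated is just an illustrative name, not something from readtext or quanteda, and the rule is an assumption about what your text looks like.

```r
# Hypothetical helper: join words split as "shin- ing" into "shining".
# Only removes "-<whitespace>" when a letter comes directly before the dash
# and directly after the whitespace (lookarounds need perl = TRUE).
join_hyphenated <- function(x) {
  gsub("(?<=[[:alpha:]])-\\s+(?=[[:alpha:]])", "", x, perl = TRUE)
}

vec <- c("Today is a good day because the sun is shin- ing.",
         "a range of 3 - 5 days")
join_hyphenated(vec)
# [1] "Today is a good day because the sun is shining."
# [2] "a range of 3 - 5 days"
```

Either version will also merge genuinely hyphenated words that happen to be split at the dash (e.g. "well- known" becomes "wellknown"), so it is worth spot-checking the result on your own texts.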