Tags: r, string, vector, text, stringr

Combining singleton letters?


I have the following character vector:

vec1 <- c("D R JOHNSON", "NICE W E A T H E R")

This vector contains consecutive runs of single letters, such as "D R", and I do NOT want spaces between those letters. For example, I want this vector:

vec2 <- c("DR JOHNSON", "NICE WEATHER")

Is there a way to remove the spaces inside any consecutive series of single letters, so that "W E A T H E R" becomes "WEATHER"?


Solution

  • You can use a lookahead (?=) and a lookbehind (?<=) with word boundaries (\\b), so that a whitespace character is removed only when it sits between two single-letter words:

    gsub("(?<=\\b[a-zA-Z]\\b)\\s(?=\\b[a-zA-Z]\\b)", "", vec1, perl=TRUE)
    

    [1] "DR JOHNSON"   "NICE WEATHER"
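
    To make it easier to see what the pattern does and does not touch, here is a minimal sketch that wraps the same gsub() call in a helper. The name collapse_singletons and the extra test strings are my own, not from the question. The point is that a space is removed only when BOTH neighbours are single-letter words, so an isolated initial such as the "Q" in "JOHN Q PUBLIC" is left alone.

```r
# Hypothetical helper (name is an assumption): same regex as above.
collapse_singletons <- function(x) {
  gsub(
    "(?<=\\b[a-zA-Z]\\b)\\s(?=\\b[a-zA-Z]\\b)",  # whitespace between two one-letter words
    "",
    x,
    perl = TRUE  # lookbehind and lookahead need the PCRE engine
  )
}

vec1 <- c("D R JOHNSON", "NICE W E A T H E R")
collapse_singletons(vec1)
# [1] "DR JOHNSON"   "NICE WEATHER"

# A single letter next to a longer word is not merged into it.
collapse_singletons(c("JOHN Q PUBLIC", "I AM HERE"))
# [1] "JOHN Q PUBLIC" "I AM HERE"
```

    Note that \\s matches exactly one whitespace character, so letters separated by two or more spaces would keep one space less rather than being joined completely; using \\s+ instead would be one way to handle that, if your data needs it.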