
Removing single-character Korean words with a regex in R


I'm trying to remove Korean words from a string in R when the word is exactly one character long (nchar equal to 1). For example,

have = "안녕 난 철수야"
want = "안녕 철수야"

I've found an English-language version of what I'm trying to achieve: "How to remove words of specific length in a string in R?" Can someone help me do the same thing for Korean text?

Many thanks in advance.


Solution

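  • We could try matching a single Hangul character that has whitespace on both sides (the \p{Hangul} Unicode script property needs perl = TRUE) and replacing the match with one space: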

    gsub('\\s\\p{Hangul}\\s',' ', have, perl = TRUE)
    #[1] "안녕 철수야"
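
    The pattern above needs whitespace on both sides of the character, so it will not catch a single-character word at the very start or end of the string, nor the second of two adjacent single-character words. As a rough alternative sketch (the helper name `drop_single_hangul` is made up for illustration, and it assumes a UTF-8 session and a single input string), one could split on whitespace and drop the words that consist of exactly one Hangul character:

```r
# Hypothetical helper: remove words that are exactly one Hangul character.
# Assumes `x` is a single UTF-8 string (not a vector of strings).
drop_single_hangul <- function(x) {
  # Split into whitespace-separated words; trim first to avoid empty tokens
  words <- strsplit(trimws(x), "\\s+")[[1]]
  # TRUE for words made of exactly one Hangul character
  is_single <- grepl("^\\p{Hangul}$", words, perl = TRUE)
  # Re-join the remaining words with single spaces
  paste(words[!is_single], collapse = " ")
}

have <- "안녕 난 철수야"
drop_single_hangul(have)
#[1] "안녕 철수야"
```

    Note that this collapses any run of whitespace into a single space, which may or may not be what you want.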