
Changing isolated words in R?


In R, I have a character vector like:

vector<-c("BAKING CO", "NEW COBALT", "CO INC")

I would like to convert the word "CO" to "COMPANY", but only when "CO" appears as a word by itself. I do not want to change words that merely contain it, such as "COBALT". My desired output is:

vector<-c("BAKING COMPANY", "NEW COBALT", "COMPANY INC")

Is there a way to do this in R?


Solution

  • Use word boundaries (\\b in an R string), so that the pattern matches "CO" only as a whole word:

    library(stringr)
    
    str_replace_all(vector, "\\bCO\\b", "COMPANY")
    

    In base R:

    gsub("\\bCO\\b", "COMPANY", vector)
    

    Note: if you have multiple abbreviations to change, str_replace_all() can take a named vector, where the names are the patterns and the values are the replacements:

    vector <- c("BAKING CO", "NEW COBALT", "CO INC")
    change <- c("CO" = "COMPANY", "INC" = "INCORPORATED")
    
    library(stringr)
    
    # Wrap each abbreviation in word boundaries so only whole words match
    names(change) <- str_c("\\b", names(change), "\\b")
    str_replace_all(vector, change)
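
For anyone who wants to see why the boundaries matter, or who needs the multi-abbreviation version without stringr, here is a minimal base R sketch. It reuses the example data from above; `replace_words()` is a hypothetical helper name introduced only for illustration, not a function from any package.

```r
# Example data, same as in the answer above.
vector <- c("BAKING CO", "NEW COBALT", "CO INC")
change <- c("CO" = "COMPANY", "INC" = "INCORPORATED")

# Without word boundaries, the "CO" inside "COBALT" is replaced as well.
naive <- gsub("CO", "COMPANY", vector)

# replace_words() is a hypothetical helper (an assumption, not a library API).
# It applies each replacement in turn, wrapping every name in \\b ... \\b.
replace_words <- function(x, map) {
  for (word in names(map)) {
    pattern <- paste0("\\b", word, "\\b")
    x <- gsub(pattern, map[[word]], x)
  }
  x
}

result <- replace_words(vector, change)

print(naive)
print(result)
```

Here `naive` comes out as "BAKING COMPANY", "NEW COMPANYBALT", "COMPANY INC", while `result` is "BAKING COMPANY", "NEW COBALT", "COMPANY INCORPORATED". Two things to keep in mind with this sketch: the replacements run one after another, so the output of an earlier replacement could in principle be matched by a later pattern, and \\b treats punctuation as a boundary, so "CO-OP" would also have its "CO" replaced.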