r text-mining apostrophe

Remove special apostrophe in R


I am doing some text mining and would like to remove the apostrophe-like character " (a double quotation mark) from my text, i.e. delete it. I tried to use gsub as follows, but it does not work:

text <- "\"branch"

removeSpecialChars <- function(x){
     result <- gsub('"',x)
     return(result)
}

without <- removeSpecialChars(text)

The desired output would be branch and not "branch. Thanks for your help.

EDIT to go further (I am trying to clean a text).

The input is a list containing many different strings. For example:

Input <- list(c("e","b", "stackoverflow", "\"branch"))

cleanCorpus <- function(corpus){
  corpus.tmp <- tm_map(corpus, removePunctuation,preserve_intra_word_dashes = TRUE)

  removeSpecialChars <- function(x){
    result <- gsub('"', "",x)
    return(result)
  }
  corpus.tmp <- removeSpecialChars(corpus.tmp)

  corpus.tmp <- tm_map(corpus.tmp, stripWhitespace)
  corpus.tmp <- tm_map(corpus.tmp, content_transformer(tolower))
  corpus.tmp <- tm_map(corpus.tmp, removeWords, stopwords("english"))
  return(corpus.tmp)
}
result <- cleanCorpus(Input)

Solution

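  • gsub takes the pattern, then the replacement, then the input. The call in the question supplies only two of these, so the input text is taken as the replacement and x is missing. Passing an empty string as the replacement deletes the character: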

    gsub('"', "", text)
    #[1] "branch"
    

    data

    text <- "\"branch"
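
As a minimal sketch of how the corrected call fits back into the helper from the question, the block below rewrites removeSpecialChars with the replacement argument and applies it to both the single string and the example list. The use of fixed = TRUE and of lapply over the list are assumptions for illustration, not part of the original answer.

```r
# Remove every double-quote character from a character vector.
# fixed = TRUE (an illustrative choice) treats the pattern as a
# literal string rather than a regular expression.
removeSpecialChars <- function(x) {
  gsub('"', "", x, fixed = TRUE)
}

# Single string from the question
text <- "\"branch"
without <- removeSpecialChars(text)

# The example list holds one character vector; lapply applies the
# helper to each element and keeps the list shape.
Input <- list(c("e", "b", "stackoverflow", "\"branch"))
cleaned <- lapply(Input, removeSpecialChars)
```

Inside cleanCorpus, a plain function like this would probably need to be wrapped, for example tm_map(corpus.tmp, content_transformer(removeSpecialChars)), in the same way the question already wraps tolower. That assumes the input has first been turned into a tm corpus rather than passed in as a plain list.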