I am using R 3.1.3 on platform x86_64-apple-darwin13.4.0 (64-bit) with tm_0.6-2.
Here is my code:
install.packages(c("twitteR","ROAuth","RCurl","tm","wordcloud","SnowballC"))
library(SnowballC)
library(twitteR)
library(ROAuth)
library(RCurl)
library(tm)
library(wordcloud)
#twitter authentication
consumerKey <- " "
consumerSecret <- " "
accessToken <- " "
accessTokenSecret <- " "
twitteR::setup_twitter_oauth(consumerKey,consumerSecret,accessToken,accessTokenSecret)
#retrieve tweets from Twitter
tweets=searchTwitter("euro2016+france",lang = "en",n=500,resultType = "recent")
class(tweets)
head(tweets)
#converting list to vector
tweets_text=sapply(tweets,function(x) x$getText())
str(tweets_text)
#creates corpus from vector of tweets
tweets_corpus=Corpus(VectorSource(tweets_text))
inspect(tweets_corpus[100])
#cleaning
tweets_clean=tm_map(tweets_corpus,removePunctuation,lazy= T)
tweets_clean=tm_map(tweets_clean,content_transformer(tolower),lazy = T)
tweets_clean=tm_map(tweets_clean,removeWords,stopwords("english"),lazy = T)
tweets_clean=tm_map(tweets_clean,removeNumbers,lazy = T)
tweets_clean=tm_map(tweets_clean,stripWhitespace,lazy = T)
tweets_clean=tm_map(tweets_clean,removeWords,c("euro2016","france"),lazy = T)
#wordcloud play with parameters
wordcloud(tweets_clean)
When I run the final line, I get:
Error in UseMethod("meta", x) :
  no applicable method for 'meta' applied to an object of class "try-error"
In addition: Warning messages:
1: In mclapply(x$content[i], function(d) tm_reduce(d, x$lazy$maps)) :
  all scheduled cores encountered errors in user code
2: In mclapply(unname(content(x)), termFreq, control) :
  all scheduled cores encountered errors in user code
Does anyone know a solution for this?
Somehow there seems to be an encoding problem with the removeWords function when it is used together with the tm_map function (see also here). A workaround could be to apply the function earlier, at the point where you load the text into the corpus:
#converting list to vector
tweets_text=sapply(tweets,function(x) x$getText())
str(tweets_text)
# removing words
tweets_text<- sapply(tweets_text, function(x) removeWords(x, c("euro2016","france")))
tweets_text<- sapply(tweets_text, function(x) removeWords(x, stopwords("english")))
#creates corpus from vector of tweets
tweets_corpus=Corpus(VectorSource(tweets_text))
inspect(tweets_corpus[100])
#cleaning
tweets_clean=tm_map(tweets_corpus,removePunctuation)
tweets_clean=tm_map(tweets_clean,content_transformer(tolower))
#tweets_clean=tm_map(tweets_clean,removeWords,stopwords("english"))
tweets_clean=tm_map(tweets_clean,removeNumbers,lazy = T)
tweets_clean=tm_map(tweets_clean,stripWhitespace,lazy = T)
#tweets_clean=tm_map(tweets_clean,removeWords,c("euro2016","france"),lazy = T)
wordcloud(tweets_clean)
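If the root cause really is invalid characters in the raw tweets (emoji and other non-UTF-8 bytes are common in Twitter data), another option is to normalize the encoding before building the corpus. A minimal sketch, assuming you can afford to drop every character that does not survive conversion to plain ASCII:

#strip characters that iconv cannot convert; from = "" uses the current locale
tweets_text <- iconv(tweets_text, from = "", to = "ASCII", sub = "")
tweets_corpus <- Corpus(VectorSource(tweets_text))

Note that converting to ASCII also removes accented letters, so it is a rather blunt fix.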