When trying to reproduce the example found in http://tidytextmining.com/twitter.html there's a problem.
Basically I want to adapt this part of the code
library(tidytext)
library(stringr)
reg <- "([^A-Za-z_\\d#@']|'(?![A-Za-z_\\d#@]))"
tidy_tweets <- tweets %>%
mutate(text = str_replace_all(text, "https://t.co/[A-Za-z\\d]+|http://[A-Za-z\\d]+|&|<|>|RT", "")) %>%
unnest_tokens(word, text, token = "regex", pattern = reg) %>%
filter(!word %in% stop_words$word,
str_detect(word, "[a-z]"))
in order to keep the stop_Word included dataframe of tweets.
So i tried this :
tidy_tweets <- tweets %>%
mutate(text = str_replace_all(text, "https://t.co/[A-Za-z\\d]+|http://[A-Za-z\\d]+|&|<|>|RT", "")) %>%
unnest_tokens(word, text, token = "regex", pattern = reg)
tidy_tweets_sw <- filter(!word %in% stop_words$word, str_detect(tidy_tweets, "[a-z]"))
But that did not work as i got the following error message :
Error in match(x, table, nomatch = 0L) :
'match' requires vector arguments
I have tried to pass a vector version of both inputs to match, but to no avail. Does anyone have a better idea?
Unsure but I think your problem is here:
tidy_tweets_sw <- filter(!word %in% stop_words$word, str_detect(tidy_tweets, "[a-z]"))
filter
has no clue about what you want to filter at all, this should work:
tidy_tweets_sw <- tidy_tweets %>% filter(!word %in% stop_words$word, str_detect(tidy_tweets, "[a-z]"))