I'm mining some financial articles using tidytext. I download the data from Reuters, but when I try to turn each corpus into a data frame I get errors saying that the unnest command does not take functions as input.
Is there an alternative way to get this into a tibble?
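library(tm.plugin.webmining)
library(purrr)
library(dplyr)     # data_frame(), mutate(), select(), %>%
library(tidyr)     # unnest()
library(tidytext)  # tidy() for corpora, unnest_tokens()

company <- c("Microsoft", "Apple", "Google", "Amazon", "Facebook",
             "Twitter", "IBM", "Yahoo", "Netflix")
symbol <- c("MSFT", "AAPL", "GOOG", "AMZN", "FB", "TWTR", "IBM", "YHOO", "NFLX")

# Download the Reuters articles for one NASDAQ ticker as a WebCorpus
download_articles <- function(symbol) {
  WebCorpus(ReutersNewsSource(paste0("NASDAQ:", symbol)))
}

# One row per company, with the downloaded corpus stored in a list-column
stock_articles <- data_frame(company = company, symbol = symbol) %>%
  mutate(corpus = map(symbol, download_articles))
stock_articles

# Tidy each corpus, then split the article text into one word per row
stock_tokens <- stock_articles %>%
  unnest(map(corpus, tidy)) %>%
  unnest_tokens(word, text) %>%
  select(company, datetimestamp, word, id, heading)
stock_tokens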
What's happening here is that, unfortunately, some of the services have been deprecated and tm.plugin.webmining is out of date. You can read some more details here. We are looking for a replacement dataset for this part of our book, but in the meantime, if you would like to explore using this code, I would recommend stripping it down to just the four or so companies that appear to still be working.
company <- c("Microsoft", "Apple", "Amazon", "IBM")
symbol <- c("MSFT", "AAPL", "AMZN", "IBM")
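If you would rather not hard-code which feeds work, one option is to wrap the downloader so that a failing feed returns NULL instead of stopping the whole pipeline, and then drop those rows. This is only a minimal sketch: `download_safely()` and `fake_downloader()` are hypothetical names introduced here for illustration, and the stub's choice of failing symbols is made up so the example runs without network access. In real use you would pass your own `download_articles` instead of the stub.

```r
library(dplyr)
library(purrr)
library(tibble)

# Lookup table from the question, kept in one tibble so company and symbol stay aligned
companies <- tibble(
  company = c("Microsoft", "Apple", "Google", "Amazon", "Facebook",
              "Twitter", "IBM", "Yahoo", "Netflix"),
  symbol  = c("MSFT", "AAPL", "GOOG", "AMZN", "FB", "TWTR", "IBM", "YHOO", "NFLX")
)

# Restrict to the symbols that appear to still be working
working <- c("MSFT", "AAPL", "AMZN", "IBM")
companies_working <- companies %>% filter(symbol %in% working)

# Hypothetical helper: call `downloader` on each symbol, returning NULL on error
download_safely <- function(symbols, downloader) {
  map(symbols, possibly(downloader, otherwise = NULL))
}

# Hypothetical stand-in for download_articles(); the failing symbols are invented
fake_downloader <- function(symbol) {
  if (symbol %in% c("GOOG", "YHOO")) stop("feed unavailable")
  paste("articles for", symbol)
}

# Download everything, then keep only the rows whose download succeeded
stock_articles_ok <- companies %>%
  mutate(corpus = download_safely(symbol, fake_downloader)) %>%
  filter(!map_lgl(corpus, is.null))

stock_articles_ok
```

Filtering the tibble, rather than editing the two vectors separately, avoids a length mismatch between company and symbol when the list of working feeds changes.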