library(tm)
reut21578 <- system.file("texts", "crude", package = "tm")
reuters <- Corpus(DirSource(reut21578),
                  readerControl = list(reader = readReut21578XML))
file <- "reut-0001.xml"
reuters <- Corpus(ReutersSource(file), readerControl = list(reader = readReut21578XML))
I am using the tm package to read the Reuters data, but the call to ReutersSource() fails with this error:
Error in inherits(x, "Source") : could not find function "ReutersSource"
I think the developers have removed ReutersSource() from the tm package, which is why R cannot find the function.
If you want to read a single specific file, you can pass a pattern (a regular expression matched against the file names) to the DirSource() function, like this:
reuters <- Corpus(DirSource(reut21578, pattern = "00001.xml"),
                  readerControl = list(reader = readReut21578XMLasPlain))
cat(content(reuters[[1]]))
Result:
Diamond Shamrock Corp said that effective today it had cut its contract prices for crude oil by 1.50 dlrs a barrel. The reduction brings its posted price for West Texas Intermediate to 16.00 dlrs a barrel, the copany said. "The price reduction today was made in the light of falling oil product prices and a weak crude oil market," a company spokeswoman said. Diamond is the latest in a line of U.S. oil companies that have cut its contract, or posted, prices over the last two days citing weak oil markets. Reuter
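Because pattern is treated as a regular expression, "00001.xml" would also match any other file name that happens to contain that text. Below is a minimal sketch of a stricter variant. It assumes the bundled crude directory contains a file named reut-00001.xml, so check the list.files() output first. It also calls getSources(), which lists the source constructors exported by your installed tm version, as a quick way to confirm that ReutersSource is not among them.

```r
library(tm)

# Source constructors available in the installed tm version
available_sources <- getSources()
print(available_sources)

reut21578 <- system.file("texts", "crude", package = "tm")

# File names in the bundled sample directory, to pick a pattern from
print(list.files(reut21578))

# Anchored regular expression: matches one exact file name only.
# "reut-00001.xml" is an assumed file name; adjust it to what list.files() shows.
single <- Corpus(DirSource(reut21578, pattern = "^reut-00001\\.xml$"),
                 readerControl = list(reader = readReut21578XMLasPlain))

# Number of documents read from the matched file
print(length(single))
```

Anchoring with ^ and $ and escaping the dot keeps the match to one exact file name, which matters once the directory holds files with similar names.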