ReutersSource in R


library(tm)  
reut21578 <- system.file("texts", "crude", package = "tm")  
reuters <- Corpus(DirSource(reut21578), 
                  readerControl = list(reader = readReut21578XML))  
file <- "reut-0001.xml"   
reuters <- Corpus(ReutersSource(file), readerControl = list(reader = readReut21578XML))  

I am using the tm package to access the Reuters data, but the call to ReutersSource() gives me an error:

Error in inherits(x, "Source") : could not find function "ReutersSource"


Solution

  • I think the developers have removed ReutersSource() from the tm package, which would explain why R cannot find the function.

    If you want to read in a single specific file, you can pass a file name filter (a regular expression) to the DirSource() function through its pattern argument, like this:

    reuters <- Corpus(DirSource(reut21578, pattern = "00001.xml"),
                      readerControl = list(reader = readReut21578XMLasPlain))

    cat(content(reuters[[1]]))


    Result:

    Diamond Shamrock Corp said that effective today it had cut its contract prices for crude oil by 1.50 dlrs a barrel. The reduction brings its posted price for West Texas Intermediate to 16.00 dlrs a barrel, the copany said. "The price reduction today was made in the light of falling oil product prices and a weak crude oil market," a company spokeswoman said. Diamond is the latest in a line of U.S. oil companies that have cut its contract, or posted, prices over the last two days citing weak oil markets. Reuter
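
One caveat about the pattern: as far as I know, DirSource() passes it on to base R's file listing, so it is treated as a regular expression and not as an exact file name. The sketch below uses only base R to show the difference between a loose and an anchored pattern. The scratch directory and the file names in it are made up for illustration (they only imitate the naming of the tm sample files), and I am assuming list.files() behaves the same way DirSource() does here.

```r
# Hypothetical scratch directory with file names shaped like the tm "crude" sample
demo_dir <- tempfile("crude_demo")
dir.create(demo_dir)
invisible(file.create(file.path(
  demo_dir, c("reut-00001.xml", "reut-00002.xml", "reut-100001.xml")
)))

# Unanchored pattern: "." matches any character and the match can start
# anywhere in the name, so more than one file can slip through
loose <- list.files(demo_dir, pattern = "00001.xml")

# Anchored pattern with an escaped dot: matches only this exact file name
strict <- list.files(demo_dir, pattern = "^reut-00001\\.xml$")

print(loose)   # "reut-00001.xml"  "reut-100001.xml"
print(strict)  # "reut-00001.xml"
```

If the directory could ever contain similarly named files, the anchored form is probably the safer thing to pass as pattern to DirSource().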