R text mining: how to see the detailed content of a <<PlainTextDocument>>?


I have just started learning text mining. Following a book, I used tm::inspect() to look at the first document in the "crude" data set, but unlike the example in the book, R printed only the summary below instead of the document's full text.

Why does this happen, and how can I fix it? Thanks!

My code:

library(tm)
data(crude)
inspect(crude[1])
summary(crude)

And the output:

> inspect(crude[1])
<<VCorpus>>
Metadata:  corpus specific: 0, document level (indexed): 0
Content:  documents: 1

$`reut-00001.xml`
<<PlainTextDocument>>
Metadata:  15
Content:  chars: 527

> summary(crude)
    Length Class             Mode
127 2      PlainTextDocument list
144 2      PlainTextDocument list
191 2      PlainTextDocument list
194 2      PlainTextDocument list
211 2      PlainTextDocument list
236 2      PlainTextDocument list
237 2      PlainTextDocument list

Solution

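  • Could it be that you forgot a pair of square brackets? crude[1] returns a corpus that contains one document, so inspect() prints only a summary of it; crude[[1]] extracts the document itself: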

    library(tm)
    data("crude")
    
    inspect(crude[[1]])
    

    For me, this prints the following:

    <<PlainTextDocument>>
    Metadata:  15
    Content:  chars: 527
    
    Diamond Shamrock Corp said that
    effective today it had cut its contract prices for crude oil by
    1.50 dlrs a barrel.
        The reduction brings its posted price for West Texas
    Intermediate to 16.00 dlrs a barrel, the copany said.
        "The price reduction today was made in the light of falling
    oil product prices and a weak crude oil market," a company
    spokeswoman said.
        Diamond is the latest in a line of U.S. oil companies that
    have cut its contract, or posted, prices over the last two days
    citing weak oil markets.
     Reuter
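
To make the difference between the two kinds of brackets concrete, here is a minimal sketch that goes slightly beyond the answer above. It assumes a reasonably recent version of tm, where content() and meta() are available once tm is loaded; the variable name doc is only illustrative.

```r
library(tm)      # loading tm also attaches NLP, which provides content() and meta()
data("crude")

# Single brackets subset the corpus: the result is still a corpus.
class(crude[1])

# Double brackets extract the document itself.
doc <- crude[[1]]
class(doc)

# Print only the text, without the summary header that inspect() adds.
writeLines(as.character(doc))

# Document-level metadata (the entries counted in the "Metadata:" line).
meta(doc)
```

If the book shows inspect(crude[1]) printing the full text, it was probably written against an older tm release, so checking packageVersion("tm") against the version used in the book may explain the mismatch.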