
How to produce a table.file for input in ReadMe: Software for Automated Content Analysis


I'm trying to use the ReadMe package by Hopkins et al. but can't get it to work with my own data. The demo runs fine, and I converted my data into the format the package expects (an individual txt file for every text, plus a control.txt file with the true labels and so on). What I haven't managed to do is create the table.file. The table file contains a table of word frequencies, and the demo file looks like this when opened in Excel:

[Screenshot: the demo table.file opened in Excel]

Unfortunately, there doesn't seem to be any documentation on how to create such a table.file. The documentation only says:

table.file: Path of file in which table of word frequencies should be stored. Defaults to “tablefile.txt”. Of course, user must have read and write access to this file, and prior contents of file will be overwritten.

Can anybody point me to a program or code that produces such files? Or am I missing something in the documentation?


Solution

  • I searched a bit longer and have now solved my own problem. I'm posting the answer so that anyone who finds this gets the crucial clues.

    The tablefile.txt is a tab-separated document-term matrix with three extra columns: "FILENAME" (the file name of each text to be analysed, e.g. 'text21.txt'), "TRUTH" (the true category of the text; can be NA for the test set), and "TRAININGSET" (1 if the text belongs to the training set, 0 if it belongs to the test set).

    The document-term matrix can be produced by following a tutorial from the net, e.g. this.
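
To make the layout described above concrete, here is a minimal sketch in base R. It is not taken from the ReadMe package: `make_table_file` is a hypothetical helper, the tokenisation is deliberately crude (lower-casing and splitting on anything that is not an ASCII letter), and it assumes the three extra columns come first and that the cells hold raw word counts. Compare the result with the demo file before relying on either assumption.

```r
# Hypothetical helper (not part of ReadMe): writes a tab-separated
# document-term matrix with FILENAME, TRUTH and TRAININGSET columns.
make_table_file <- function(filenames, texts, truth, trainingset,
                            out = "tablefile.txt") {
  # Crude tokenisation: lower-case, split on non-letters, drop empty strings.
  tokens <- lapply(texts, function(x) {
    words <- strsplit(tolower(x), "[^a-z]+")[[1]]
    words[nzchar(words)]
  })
  vocab <- sort(unique(unlist(tokens)))

  # Document-term matrix: one row per text, one column per word.
  dtm <- matrix(0L, nrow = length(tokens), ncol = length(vocab),
                dimnames = list(NULL, vocab))
  for (i in seq_along(tokens)) {
    counts <- table(tokens[[i]])
    if (length(counts) > 0) {
      dtm[i, names(counts)] <- as.integer(counts)
    }
  }

  # Assumption: the three extra columns precede the word columns.
  table_df <- data.frame(FILENAME = filenames, TRUTH = truth,
                         TRAININGSET = trainingset, dtm,
                         check.names = FALSE, stringsAsFactors = FALSE)

  # Tab-separated, no quoting, no row names; NA is written as "NA".
  write.table(table_df, file = out, sep = "\t", quote = FALSE,
              row.names = FALSE)
  invisible(table_df)
}

# Toy example with made-up texts; the third text is in the test set.
texts <- c("The cat sat on the mat.",
           "A dog chased the cat!",
           "Dogs and cats play.")
out_path <- file.path(tempdir(), "tablefile.txt")
tab <- make_table_file(
  filenames   = c("text1.txt", "text2.txt", "text3.txt"),
  texts       = texts,
  truth       = c(1, 2, NA),
  trainingset = c(1, 1, 0),
  out         = out_path
)
```

With real data, the `texts` vector would be filled by reading each txt file (for example with `readLines`) rather than typed in. Because the words are lower-cased, they cannot collide with the upper-case FILENAME, TRUTH and TRAININGSET column names.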