Exporting data frame columns into separate txt files


I chunked several novels into a data frame called documents. I want to export each chunk as a separate .txt file.

The data frame consists of two columns. The first column has the file name for each chunk, and the second column has the actual text that should go into the file.

documents[1,1]
[1] "Beloved.txt_1"

documents[1,2]
[1] "124 was spiteful full of a baby's venom the women......"

class(documents)
[1] "data.frame"

I'm trying to write a for loop that takes each row, writes the second column to a .txt file, and uses the first column as the name of that file, repeating this for every row. I've been working with something like this:

for (i in 1:ncol(documents)) {
  write(tagged_text, paste("data/taggedCorpus/",
                     documents[i], ".txt", sep=""))

I've also been reading that maybe the cat function would work well here?


Solution

  • I'm not positive this will work for you (a little more of an example of your input and desired output would help), but one issue you've got is that your for loop runs by column rather than by row. If you want to do this once for every row, then it needs to be for (i in 1:nrow(documents)) rather than ncol.

    Assuming that "documents" is the name of your data.frame and that the column containing the text you want to save is called "tagged_text" and the column with the file name is called "file", try this:

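     # Loop over rows; seq_len() also copes with a data frame that has zero rows
     for (i in seq_len(nrow(documents))) {
          # as.character() guards against the text column being stored as a factor
          write(as.character(documents$tagged_text[i]),
                paste0("data/taggedCorpus/", documents$file[i], ".txt"))
     }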
    

    Note that you don't need to specify the path on every iteration if the working directory already points at the output folder (for example, set with setwd()) before you start the loop.
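
    As for the cat question: a minimal, self-contained sketch of the same row-by-row export is shown here, using writeLines() with the cat() call left in a comment for comparison. The small data frame is a made-up stand-in for the real documents, the column names file and tagged_text are the same assumed names as in the loop above, and tempdir() is used only so the sketch runs anywhere; a real run would presumably point out_dir at data/taggedCorpus.

```r
# Hypothetical stand-in for the real `documents` data frame; the column
# names `file` and `tagged_text` are assumptions, as in the loop above.
documents <- data.frame(
  file = c("Beloved.txt_1", "Beloved.txt_2"),
  tagged_text = c("first chunk of text", "second chunk of text"),
  stringsAsFactors = FALSE
)

# Build the output directory once, outside the loop. tempdir() keeps the
# sketch self-contained; swap in "data/taggedCorpus" for a real run.
out_dir <- file.path(tempdir(), "taggedCorpus")
dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)

# One file per row: the first column names the file, the second is its text.
for (i in seq_len(nrow(documents))) {
  out_file <- file.path(out_dir, paste0(documents$file[i], ".txt"))
  writeLines(documents$tagged_text[i], out_file)
  # cat() works similarly, but the trailing newline must be added by hand:
  # cat(documents$tagged_text[i], "\n", file = out_file, sep = "")
}

# List what was written to check the result.
written <- list.files(out_dir)
```

    Creating out_dir once and joining paths with file.path() avoids both repeating the folder name inside the loop and changing the working directory.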