This seems like a very basic issue. The file path is valid and I can open the file by other means in R, but I want to use the tm library.
docs <- Corpus(DirSource("C:/Users/xyz/Work/test.corpus.txt", encoding = "UTF-8"))
This throws the following error:
Error in inherits(x, "Source") : empty directory
EDIT:
This works with the original method:
docs <- Corpus(DirSource("C:/Users/xyz/Work/", encoding = "UTF-8"))
Apparently you cannot pass an individual file name to DirSource; it expects a directory, which is why a file path produces the "empty directory" error. The solution is to read the file via another method and then use another source type such as VectorSource.
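A minimal sketch of that workaround, assuming the whole file should become a single document. The temporary file stands in for C:/Users/xyz/Work/test.corpus.txt so the example runs anywhere, and its contents are made up for illustration.

```r
library(tm)

# Hypothetical stand-in for the real file; replace `path` with your own path.
path <- file.path(tempdir(), "test.corpus.txt")
writeLines(c("First line of text.", "Second line of text."), path)

# Read the file with base R, then join the lines into one string
# so that VectorSource treats the file as a single document.
lines <- readLines(path, encoding = "UTF-8")
text <- paste(lines, collapse = "\n")

docs <- Corpus(VectorSource(text))
length(docs)
```

If each line should be its own document instead, pass `lines` to VectorSource directly without collapsing.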
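You can specify a pattern so that DirSource only picks up the files matching that pattern. The pattern is a regular expression, so pattern = ".txt" matches all txt files (a stricter form would be pattern = "\\.txt$"). Or, if you want a single file, pattern = "test.corpus.txt". Something like below.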
docs <- Corpus(DirSource("C:/Users/xyz/Work/", pattern = "test.corpus.txt", encoding = "UTF-8"))
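Because the pattern is treated as a regular expression, an unescaped dot matches any character. A sketch of a stricter version follows, assuming you want exactly one file. The temporary directory and the two file names in it are hypothetical and exist only so the example is runnable.

```r
library(tm)

# Hypothetical demo directory containing the target file plus one other file.
demo_dir <- file.path(tempdir(), "tm_pattern_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines("some text", file.path(demo_dir, "test.corpus.txt"))
writeLines("other text", file.path(demo_dir, "notes.md"))

# Anchor the pattern with ^ and $ and escape the dots so that
# only the file named exactly test.corpus.txt is picked up.
src <- DirSource(demo_dir, pattern = "^test\\.corpus\\.txt$", encoding = "UTF-8")
docs <- Corpus(src)
length(docs)
```

The looser pattern from the answer also works in practice; the anchored form just avoids accidentally matching names such as test_corpus.txt.bak.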