I am trying to load a corpus with the standard import function of the mallet
R package, specifically:
instance <- mallet.import(names(txt$CELEX), txt$TEXT, stoplist.file = "stopwords.en.txt", token.regexp = "\\p{L}[\\p{L}\\p{P}]+\\p{L}")
I then get the following error:
Error in .jcall("RJavaTools", "Ljava/lang/Object;", "invokeMethod", cl, :
java.lang.NullPointerException
which looks to me like an rJava error more than anything else. My session info follows:
R version 3.3.0 (2016-05-03)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
locale:
[1] LC_COLLATE=Danish_Denmark.1252 LC_CTYPE=Danish_Denmark.1252 LC_MONETARY=Danish_Denmark.1252
[4] LC_NUMERIC=C LC_TIME=Danish_Denmark.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] mallet_1.0 XLConnect_0.2-12 XLConnectJars_0.2-12 quanteda_0.9.6-9 rJava_0.9-8
[6] topicmodels_0.2-4
loaded via a namespace (and not attached):
[1] Rcpp_0.12.5 lattice_0.20-33 slam_0.1-35 chron_2.3-47 grid_3.3.0 stats4_3.3.0
[7] stringi_1.1.1 data.table_1.9.6 NLP_0.1-9 ca_0.64 Matrix_1.2-6 tools_3.3.0
[13] parallel_3.3.0 tm_0.6-2 modeltools_0.2-21
I use Java 8, in case it matters; I read somewhere that rJava does not play well with Java 8.
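That's not the error that rJava problems usually produce, and those problems seem to have settled down.
One possible cause is that the stoplist file does not exist or is not where R is looking for it. A relative path such as "stopwords.en.txt" is resolved against the current working directory.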
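
The stoplist suggestion can be checked before any Java code runs. Below is a minimal sketch in base R. The helper check_mallet_inputs is hypothetical (it is not part of mallet or rJava). Besides the stoplist path, it also checks the two input vectors, on the assumption that a NULL or mismatched argument could surface as a NullPointerException as well. That is a guess, not something the error message confirms.

```r
# Hypothetical helper (not part of the mallet package): sanity-check the
# arguments before handing them to mallet.import().
check_mallet_inputs <- function(ids, texts, stoplist_file) {
  problems <- character(0)

  # Relative paths are resolved against getwd(), so report it too.
  if (!file.exists(stoplist_file)) {
    problems <- c(problems,
                  paste0("stoplist file not found: ", stoplist_file,
                         " (working directory: ", getwd(), ")"))
  }

  # names(x) returns NULL when x carries no names attribute.
  if (is.null(ids)) {
    problems <- c(problems, "ids is NULL (names() of an unnamed vector)")
  } else if (length(ids) != length(texts)) {
    problems <- c(problems, "ids and texts differ in length")
  }

  if (!is.character(texts)) {
    problems <- c(problems, "texts is not a character vector")
  }

  problems
}

# Usage with the objects from the question (assumes txt is already loaded):
# problems <- check_mallet_inputs(names(txt$CELEX), txt$TEXT, "stopwords.en.txt")
# if (length(problems) > 0) stop(paste(problems, collapse = "\n"))
```

If the helper returns an empty character vector, the file and the inputs look sane, and the cause is more likely on the Java side.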