Static Java function imported with rJava doesn't work with tm_map()


I've prepared a class with a static method in Java 6, which I've exported to a JAR file:

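package pl.poznan.put.stemutil;

import java.util.HashSet;
import java.util.Set;
// The import for StringUtils is not shown in the original post.

public class Stemmer {
    // Returns the distinct stems found for the text, joined with spaces.
    public static String stemText(String text) {
        Set<String> c = new HashSet<String>();
        ...
        return StringUtils.join(c, " ");
    }
}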

I import it into R with the following code:

require(rJava)
.jinit("java/stem-util.jar")
stem = J("pl.poznan.put.stemutil.Stemmer")$stemText

When I call it directly, it works, e.g.:

> stem("płotkami")
[1] "płotek płotka"

But when I try to use it with the tm_map() function, something goes wrong:

> vc = VCorpus(vs, readerControl = list(language = "pl"))
> vc[[1]]
<<PlainTextDocument (metadata: 7)>>
 mirki mirkówny zaczynam wolne jutra ( ͡° ͜ʖ ͡°) #pijzwykopem #piwozlidla
> vc = tm_map(vc, stem)
Warning message:
In mclapply(content(x), FUN, ...) :
  all scheduled cores encountered errors in user code
> vc[[1]]
[1] "Error in FUN(X[[1L]], ...) : \n  Sorry, parameter type `NA' is ambiguous or not supported.\n"
attr(,"class")
[1] "try-error"
attr(,"condition")
<simpleError in FUN(X[[1L]], ...): Sorry, parameter type `NA' is ambiguous or not supported.>

What am I doing incorrectly?


Solution

  • In the end, adding the mc.cores parameter worked for me. However, it is more of a workaround than a proper solution.

    vc = tm_map(vc, content_transformer(stem), mc.cores=1)
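The warning in the question shows that tm_map() hands the work to mclapply(), and mclapply() only forks child processes when mc.cores is at least 2. A likely reason the workaround helps is that with mc.cores=1 the function runs in the same R process in which the JVM was started, since a JVM started by rJava generally cannot be used from forked children. The sketch below illustrates just the mclapply() part; where_am_i is a hypothetical stand-in for stem() that needs no JVM or JAR, and the document strings are taken from the question.

```r
library(parallel)

# Hypothetical stand-in for the Java-backed stem(): it ignores the text and
# only reports the ID of the process it was executed in.
where_am_i <- function(text) Sys.getpid()

docs <- list("mirki mirkówny zaczynam wolne jutra", "płotkami")

# With mc.cores = 1 no child process is forked, so each call should run in
# the calling R process (the one that would own the JVM).
pids <- mclapply(docs, where_am_i, mc.cores = 1)

# Compare the process IDs seen by the function with the current process ID.
ran_in_parent <- all(unlist(pids) == Sys.getpid())
print(ran_in_parent)
```

If the fork explanation is right, this also means the workaround gives up parallelism entirely, which matches the remark above that it is not a proper solution.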