javascript, scala, java-13, graaljs

How do I get the script output from the GraalJS script engine encoded correctly?


We're processing JSON in our Scala program by executing dynamically generated JavaScript code. This worked fine in Java 8 when using the included Nashorn script engine.

We have now switched to Java 13. Nashorn is no longer included, so we included GraalJS instead. It works fine, except that international characters are handled incorrectly in the output. It looks like the output is translated to UTF-8 twice.

This is a short example showing the problem:

import java.io.StringWriter
import com.oracle.truffle.js.scriptengine.GraalJSScriptEngine
import org.graalvm.polyglot.Context

val engine = GraalJSScriptEngine.create(null,
    Context.newBuilder("js")
        .option("js.ecmascript-version", "2020")
        .option("js.script-engine-global-scope-import", "false")
)

// Capture everything the script prints
val scriptOutput = new StringWriter()
engine.getContext.setWriter(scriptOutput)

engine.eval("print('Test åäö !');")
val out = scriptOutput.toString
println(out)

The result is: Test Ã¥Ã¤Ã¶ !
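The "translated to UTF-8 twice" suspicion can be reproduced with plain JDK calls, independent of GraalJS. The following is a minimal sketch, assuming the engine hands each UTF-8 byte to the writer as if it were a character; `DoubleEncodingDemo`, `bytesAsChars` and `repair` are hypothetical names used only for illustration.

```scala
import java.nio.charset.StandardCharsets

object DoubleEncodingDemo {
  // Simulates the suspected behaviour: the text is encoded as UTF-8,
  // then every byte is treated as one character (ISO-8859-1 maps
  // bytes 0x00-0xFF one-to-one onto code points U+0000-U+00FF).
  def bytesAsChars(s: String): String =
    new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1)

  // Reverses the damage: turn each character back into a byte,
  // then decode the byte sequence as UTF-8.
  def repair(garbled: String): String =
    new String(garbled.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8)
}

// "\u00e5\u00e4\u00f6" is "åäö", written with escapes so the example
// does not depend on the source file encoding.
val original = "Test \u00e5\u00e4\u00f6 !"
val garbled  = DoubleEncodingDemo.bytesAsChars(original)
val restored = DoubleEncodingDemo.repair(garbled)
```

Here `garbled` contains two characters for every accented letter (for example `Ã¥` for `å`), and `restored` equals `original` again. If the script output matches this pattern, that points to bytes being written as characters rather than to a wrong charset setting on the console.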

Am I doing this wrong, or is this a bug in GraalJSScriptEngine? I cannot find any documentation on it.

Note: I have worked around it for now by using my own StringWriter that stores the raw bytes and then decoding them as UTF-8, but it doesn't feel like the right way to do it...

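import java.io.{ByteArrayInputStream, InputStreamReader, StringWriter}
import java.nio.charset.StandardCharsets
import org.apache.commons.io.IOUtils
import scala.collection.mutable.ArrayBuffer

val buff = ArrayBuffer[Byte]()
val scriptOutput = new StringWriter() {
    // The engine passes one UTF-8 byte per call; keep it as a raw byte
    override def write(c: Int): Unit =
        buff.append(c.toByte)
}

// Execute JavaScript code

// Decode the collected bytes as UTF-8
val out = IOUtils.toString(new InputStreamReader(new ByteArrayInputStream(buff.toArray), StandardCharsets.UTF_8))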
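The same workaround can be written with JDK classes only. This is a sketch rather than a recommended fix, and it relies on the same assumption as the code above: that the affected engine version calls `write(int)` once per UTF-8 byte. `ByteCollectingWriter` and `decoded` are hypothetical names.

```scala
import java.io.{ByteArrayOutputStream, StringWriter}
import java.nio.charset.StandardCharsets

// Hypothetical helper: collects the low 8 bits of every write(int) call
// and decodes the whole byte sequence as UTF-8 on demand.
class ByteCollectingWriter extends StringWriter {
  private val bytes = new ByteArrayOutputStream()

  // ByteArrayOutputStream.write(int) stores only the low-order byte,
  // which is exactly the raw UTF-8 byte the engine passed in.
  override def write(c: Int): Unit = bytes.write(c)

  def decoded: String = new String(bytes.toByteArray, StandardCharsets.UTF_8)
}

// Simulated engine output: each UTF-8 byte arrives through write(int).
val writer = new ByteCollectingWriter()
"Test \u00e5\u00e4\u00f6 !".getBytes(StandardCharsets.UTF_8)
  .foreach(b => writer.write(b & 0xff))
val text = writer.decoded
```

With the real engine, the writer would be passed to `engine.getContext.setWriter(...)` and `decoded` read after `eval`. Output that reaches the writer through other `write` overloads is not captured by this sketch.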

Solution

  • You are right that this is a bug in GraalJSScriptEngine. There is a mismatch between read/write methods of Input/OutputStream and Reader/Writer. These issues should be fixed by this change. The fix will be available in GraalVM 20.2 and in the latest development builds.