gwt encoding decoding file-encodings string-decoding

Why does this text decoded in 2 different ways not match in GWT?

I've been trying to track down why my Russian translations are not appearing correctly in the GWT version of my game. I've narrowed it down to something going wrong with the decoding of the file. This code works correctly outside of the GWT environment.

I create the UTF-8 byte array from a string for this test. The method below writes the text to the log twice: the first version uses new String(bytes) and gives the correct output; the second reads the same bytes through an InputStreamReader wrapped in a BufferedReader and produces incorrect output. The diff of the two files can be seen here.

The classes I'm using for localisation use the BufferedReader approach and therefore output incorrect text for the Russian translation, and I'm struggling to understand why.

public void test(){
    String text = "# suppress inspection \"UnusedProperty\" for whole file\n" +
            "\n" +
            "# Notes\n" +
            "# I used the phrase \"Power Flower\" in English as it rhymes. They can be called something else in other languages.\n" +
            "# They're \"fleurs magiques\" (Magic Flowers) in French.\n" +
            "\n" +
            "# Tutorials\n" +
            "#-----------\n" +
            "Tutorial_1_1=Составляй слова, проводя пальцем по буквам.Сейчас попробуй создать слово  'СОТЫ'\n" +
            "Tutorial_1_2=Ты можешь складывать слова справа налево. Попробуй составить слово 'ЖАЛО' справа налево\n" +
            "Tutorial_1_3=Слова могут распологаться сверху вниз, снизу вверх, справа налево, слева направо, а также по диагонали.\n" +
            "Tutorial_1_4=Создавая слова, ты можешь изменять направление.Составь слово 'ВОСК'\n" +
            "Tutorial_1_5=Ты даже можешь пересекать свое собственное слово. Тем не менее, используй каждую букву только один раз. А сейчас, сложи слово 'УЛЕЙ'\n" +
            "Tutorial_1_6=Чем длиннее окажется твоё слово, тем больше у тебя шансов получить много очков и возможность заработать Чудо-Цветок. Составь слово 'ПЧЕЛА'\n" +
            "Tutorial_1_7=Получи Чудо-Цветы за каждое слово из пяти или более букв. Они могут быть использованы в качестве любой из букв.\n" +
            "Tutorial_1_8=Составь слово 'СТЕБЕЛЬ'\n" +
            "Tutorial_1_9=Из разных по длине и форме слов получаются разные Чудо-Цветы.\n" +
            "Tutorial_1_10=Теперь ты справишься сам. Составь еще четыре слова, чтобы уровень был пройден";

    // Encode explicitly as UTF-8 so the bytes do not depend on the platform default charset
    byte[] bytes = new byte[0];
    try {
        bytes = text.getBytes("UTF-8");
    } catch (UnsupportedEncodingException e) {
        e.printStackTrace();
    }

    // new String(bytes) decodes with the default charset, which in my instance, and most probably yours, is UTF-8
    String test = new String(bytes);
    // This is correct
    Gdx.app.log("File1", test);

    ByteArrayInputStream is = new ByteArrayInputStream(bytes);
    InputStreamReader reader = null;
    try {
        reader = new InputStreamReader(is, "UTF-8");
    } catch (UnsupportedEncodingException e) {
        e.printStackTrace();
    }

    BufferedReader br = new BufferedReader(reader);
    StringBuilder fileContents = new StringBuilder();
    String line;
    try {
        while ((line = br.readLine()) != null) {
            fileContents.append(line).append("\r\n");
        }
    } catch (IOException e) {
        e.printStackTrace();
    }

    // This is incorrect
    Gdx.app.log("File2", fileContents.toString());
}

Solution

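It appears that, when the bytes are read through the ByteArrayInputStream and BufferedReader, partial chunks of the data are handed to the UTF-8 decoder separately. Presumably a multi-byte character (every Cyrillic letter takes two bytes in UTF-8) that is split across two chunks cannot be decoded, which corrupts the result. Since the same code works outside GWT, this appears to be a GWT issue.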
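
To illustrate the kind of corruption described above, here is a minimal plain-Java sketch. It is not GWT's actual implementation: the class ChunkedDecodeDemo and its methods are hypothetical names invented for this example, and the fixed chunk size is an assumption chosen only to force a split inside a two-byte character.

```java
import java.nio.charset.StandardCharsets;

class ChunkedDecodeDemo {

    // Decodes the whole array in one call, so no character is ever split.
    public static String decodeWhole(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Hypothetical failure mode: each fixed-size chunk is decoded on its own
    // and the partial strings are concatenated afterwards. A character whose
    // bytes straddle a chunk boundary cannot be decoded in either chunk.
    public static String decodeInChunks(byte[] bytes, int chunkSize) {
        StringBuilder sb = new StringBuilder();
        for (int start = 0; start < bytes.length; start += chunkSize) {
            int length = Math.min(chunkSize, bytes.length - start);
            sb.append(new String(bytes, start, length, StandardCharsets.UTF_8));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "СОТЫ"; // four Cyrillic letters, two bytes each in UTF-8
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);

        // Whole-array decoding round-trips the text.
        System.out.println(decodeWhole(bytes).equals(text));       // prints true

        // A chunk size of 3 cuts through the second letter, so the result
        // contains replacement characters instead of the original text.
        System.out.println(decodeInChunks(bytes, 3).equals(text)); // prints false
    }
}
```

If this is indeed what happens under GWT, one possible workaround is to decode the complete byte array in a single call with an explicit charset (as the first log statement in the question effectively does) and split the resulting string into lines afterwards, rather than decoding through a reader. Whether that avoids the problem in a given GWT version is an assumption that would need testing.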