java zip utf-16

Extracting UTF-16 encoded file from ZIP archive in Java


In the last section of the code I print what the Reader gives me, but it's just garbage. Where did I go wrong?

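import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

public static void read_impl(File file, String targetFile) throws IOException {
    // Create the ZIP input stream; try-with-resources closes it when done
    try (ZipInputStream zipFile = new ZipInputStream(
            new BufferedInputStream(new FileInputStream(file)))) {

        // I'm looking for a specific file/entry. getNextEntry() is called once
        // per iteration and returns null at the end of the archive.
        ZipEntry entry = zipFile.getNextEntry();
        while (entry != null && !entry.getName().equals(targetFile)) {
            entry = zipFile.getNextEntry();
        }
        if (entry == null) {
            throw new FileNotFoundException(targetFile + " not found in " + file);
        }

        // The next step in the API requires a Reader
        // The target file is a UTF-16 encoded text file
        InputStreamReader reader = new InputStreamReader(zipFile, Charset.forName("UTF-16"));

        // I can't make sense of what this prints
        char[] buf = new char[1];
        while (reader.read(buf, 0, 1) != -1) {
            System.out.print(buf);
        }
    }
}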

Solution

  • I'd guess that where you went wrong was believing that the file was UTF-16 encoded.

    Can you show the first few byte values of the entry, read raw without any decoding?
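One way to get those raw bytes is to read the entry as bytes and print them in hex before any Reader is involved. The sketch below is only an illustration: the class and method names (`BomSniffer`, `firstBytes`, `guessFromBom`, `toHex`, `zipOf`) and the entry name `hello.txt` are made up for the example, and it builds a small ZIP in memory instead of using your file. A byte-order mark of `FE FF` or `FF FE` at the start suggests UTF-16, while `EF BB BF` suggests UTF-8. Also keep in mind that Java's "UTF-16" charset assumes big-endian when no byte-order mark is present, so a little-endian file without one would decode to nonsense.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper class, not part of any library.
final class BomSniffer {

    // Returns up to `limit` undecoded bytes from the start of the named entry,
    // or null if the archive has no such entry.
    public static byte[] firstBytes(InputStream zipData, String targetFile, int limit)
            throws IOException {
        try (ZipInputStream zip = new ZipInputStream(new BufferedInputStream(zipData))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!entry.getName().equals(targetFile)) {
                    continue;
                }
                byte[] buf = new byte[limit];
                int n = 0;
                while (n < limit) {
                    int r = zip.read(buf, n, limit - n);
                    if (r < 0) {
                        break; // end of this entry
                    }
                    n += r;
                }
                return Arrays.copyOf(buf, n);
            }
        }
        return null;
    }

    // Checks only for the common byte-order marks. FF FE 00 00 (UTF-32LE)
    // is not distinguished from UTF-16LE here.
    public static String guessFromBom(byte[] b) {
        if (b.length >= 3 && (b[0] & 0xFF) == 0xEF && (b[1] & 0xFF) == 0xBB
                && (b[2] & 0xFF) == 0xBF) {
            return "UTF-8 (BOM)";
        }
        if (b.length >= 2 && (b[0] & 0xFF) == 0xFE && (b[1] & 0xFF) == 0xFF) {
            return "UTF-16BE (BOM)";
        }
        if (b.length >= 2 && (b[0] & 0xFF) == 0xFF && (b[1] & 0xFF) == 0xFE) {
            return "UTF-16LE (BOM)";
        }
        return "no BOM";
    }

    // Formats bytes as space-separated two-digit hex values.
    public static String toHex(byte[] b) {
        StringBuilder sb = new StringBuilder();
        for (byte x : b) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", x & 0xFF));
        }
        return sb.toString();
    }

    // Builds a one-entry ZIP in memory so the example runs without a file.
    public static byte[] zipOf(String name, byte[] content) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            zip.putNextEntry(new ZipEntry(name));
            zip.write(content);
            zip.closeEntry();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Java's "UTF-16" encoder writes a big-endian byte-order mark first.
        byte[] zip = zipOf("hello.txt", "hi".getBytes(StandardCharsets.UTF_16));
        byte[] head = firstBytes(new ByteArrayInputStream(zip), "hello.txt", 8);
        System.out.println(toHex(head));        // FE FF 00 68 00 69
        System.out.println(guessFromBom(head)); // UTF-16BE (BOM)
    }
}
```

With your own archive you would pass a `FileInputStream` for the ZIP file and your `targetFile` name to `firstBytes`, then post the hex output. If no byte-order mark shows up, the pattern of zero bytes can still hint at the encoding: ASCII text in UTF-16 has a `00` in every other byte, whereas UTF-8 or a single-byte encoding has none.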