How do I resolve this encoding problem with ZipInputStream?


I'm reading a UTF-8 encoded text file out of a zip archive with a ZipInputStream.

I get the data through OK, but special German characters are coming out wrong.

Using this page ( http://kellykjones.tripod.com/webtools/ascii_utf8_table.html ) I can see that my code is printing out the two individual chars from the UTF8 encoding column.

i.e. ä is UTF-8 0xC3, 0xA4, and I am getting Ã¤ printed out (which are the 0xC3 and 0xA4 chars). Does anyone have any tips?

    private InputStream downloadCsv(final String countryCode) {
        final String url = baseUrl + countryCode.toUpperCase() + ".zip";
        final String fileName = countryCode.toUpperCase() + ".txt";

        BufferedInputStream in = null;
        ZipInputStream zIn = null;

        try {
            in = new BufferedInputStream(new URL(url).openStream());
            zIn = new ZipInputStream(in, Charset.forName("UTF-8"));
            
            ZipEntry zipEntry;
            
            while ((zipEntry = zIn.getNextEntry()) != null) {
                if (zipEntry.getName().equals(fileName)) {
                    StringBuilder sb = new StringBuilder();
                    
                    int c;
                    while((c = zIn.read()) != -1) {
                        sb.append((char)c);
                        System.out.println((char)c + " : " + c);
                    }

                    return new ByteArrayInputStream(sb.toString().getBytes());
                }
            }
...
more code
...
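
To make the symptom concrete, here is a minimal sketch of what the read loop above does to a two-byte UTF-8 sequence. `zIn.read()` returns one byte at a time, so casting each value to `char` treats it as its own Latin-1 character instead of decoding the pair together. The class and method names are made up for illustration and are not part of the original code.

```java
import java.nio.charset.StandardCharsets;

public class ByteCastDemo {
    // Mimics the loop in the question: every byte becomes its own char.
    static String castEachByte(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            // InputStream.read() yields 0..255, hence the mask before the cast.
            sb.append((char) (b & 0xFF));
        }
        return sb.toString();
    }

    // Decodes the whole byte sequence as UTF-8 in one step.
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] utf8 = {(byte) 0xC3, (byte) 0xA4}; // UTF-8 encoding of U+00E4 (a with umlaut)
        System.out.println(castEachByte(utf8).length()); // 2 chars: U+00C3 and U+00A4
        System.out.println(decodeUtf8(utf8).length());   // 1 char: U+00E4
    }
}
```

Note that the `Charset` passed to the `ZipInputStream` constructor is used to decode entry names and comments, not the entry contents, which is why it has no effect on the output here.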

Solution

  • For the record, I fixed this using @saka1029's advice, using an InputStreamReader, and would mark it as the accepted answer if I could!

    I can't promise my code is the cleanest, but it works now:

            BufferedInputStream in = null;
            ZipInputStream zIn = null;
            InputStreamReader zInReader = null;
    
            try {
                in = new BufferedInputStream(new URL(url).openStream());
                zIn = new ZipInputStream(in);
                
                ZipEntry zipEntry;
                
                while ((zipEntry = zIn.getNextEntry()) != null) {
                    if (zipEntry.getName().equals(fileName)) {
                        StringBuilder sb = new StringBuilder();
                        // Name the charset explicitly; without it the platform default is used.
                        zInReader = new InputStreamReader(zIn, Charset.forName("UTF-8"));
    
                        int c;
                        while((c = zInReader.read()) != -1) {
                            sb.append((char)c);
                        }
    
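                        // Encode explicitly too; this assumes the caller reads the stream as UTF-8.
                        return new ByteArrayInputStream(sb.toString().getBytes(Charset.forName("UTF-8")));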
                    }
                }
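
For anyone who wants something self-contained to try, below is a rough sketch of the same idea: wrap the `ZipInputStream` in an `InputStreamReader` with an explicit charset and read chars instead of bytes. It is not the exact code from my project. The class name, the helper `zipOf`, and the sample entry name and text are all invented for the example, and the zip is built in memory so nothing is downloaded.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class ZipTextDemo {
    /** Returns the named entry's content decoded as UTF-8, or null if no such entry exists. */
    static String readEntryAsUtf8(InputStream source, String entryName) throws IOException {
        try (ZipInputStream zIn = new ZipInputStream(source, StandardCharsets.UTF_8)) {
            ZipEntry entry;
            while ((entry = zIn.getNextEntry()) != null) {
                if (entry.getName().equals(entryName)) {
                    // The reader does the byte-to-char decoding for the entry content.
                    Reader reader = new InputStreamReader(zIn, StandardCharsets.UTF_8);
                    StringBuilder sb = new StringBuilder();
                    char[] buf = new char[4096];
                    int n;
                    while ((n = reader.read(buf)) != -1) {
                        sb.append(buf, 0, n);
                    }
                    return sb.toString();
                }
            }
        }
        return null;
    }

    /** Hypothetical helper: builds a one-entry zip in memory for the demo. */
    static byte[] zipOf(String entryName, String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zOut = new ZipOutputStream(bytes, StandardCharsets.UTF_8)) {
            zOut.putNextEntry(new ZipEntry(entryName));
            zOut.write(text.getBytes(StandardCharsets.UTF_8));
            zOut.closeEntry();
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] zip = zipOf("DE.txt", "M\u00e4dchen");
        String text = readEntryAsUtf8(new ByteArrayInputStream(zip), "DE.txt");
        System.out.println(text.equals("M\u00e4dchen")); // prints true
    }
}
```

The try-with-resources block closes the zip stream on every path, which replaces the `null` initialisation and manual cleanup in the snippet above. Returning a `String` rather than a `ByteArrayInputStream` is just a choice for the demo; it avoids a second encode step.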