java java-8 java-6 zipinputstream

Issue while unzipping a file after upgrading Java to 1.8


Java version "1.8.0_40"
Java(TM) SE Runtime Environment (build 1.8.0_40-b26)

I am using the core java.util.zip classes. I unzip a client file with this code:

public static InputStream unzip(String file,InputStream zip)
        throws IOException {
    file = file.toLowerCase();
    ZipInputStream zin = new ZipInputStream(new BufferedInputStream(zip));
    ZipEntry ze;
    while( (ze = zin.getNextEntry()) != null ) {
        if ( ze.getName().toLowerCase().equals(file) )
            return zin;
    }
    throw new RuntimeException(file+" not found in zip");
}

I am getting the following error:

invalid entry size (expected 1355916815 but got 5650884111 bytes) 

However the same code works fine in JDK 1.6.

I searched all day but could not find any documented change in the JDK that affects this code.

Please help me find the cause, or links that explain this behavior.


Solution

  • Well, 1355916815 == (int) 5650884111L, whereas 5650884111 is a number that can’t be expressed using the four bytes reserved for the size field of the classic (non-ZIP64) ZIP format.

    Since you said it worked in Java 6, which has no support for the ZIP64 format, we can conclude that your ZIP file is in the classic format, which cannot represent an entry of 5650884111 bytes, and that it was generated by a tool which simply ignored that limitation and stored only the lower 32 bits of the actual size.

    Apparently, the invalid file happened to work by accident because of the way the extraction process was implemented. The decoder processes the compressed bytes and only afterwards compares the resulting number of bytes with the uncompressed size stored in the header. If the number of extracted bytes is kept in a 32-bit int variable, it overflows silently during extraction, so by the time it is verified at the end it appears to equal the stored 32-bit size.

    Since ZIP64 support was added between Java 6 and Java 8, I suppose the decoder has been changed to use a long variable, which is reasonable, as the same decoder can then process both old ZIP and ZIP64 files. The number of extracted bytes no longer overflows, so the decoder notices that the stored size 1355916815 doesn’t match the 5650884111 bytes actually extracted.
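
    The arithmetic behind this explanation can be reproduced with a small sketch. It does not use the JDK's actual decoder code; the class and method names are hypothetical, and the int versus long counters only model the assumption made above about how the byte count was tracked in each Java version.

    ```java
    // Hypothetical demo class: models the assumed counters, not real JDK internals.
    class EntrySizeCheckDemo {

        // The classic ZIP size field holds only the low 32 bits of the real size.
        static long storedSize(long actualSize) {
            return actualSize & 0xFFFFFFFFL;
        }

        // Assumed Java 6 style: bytes are counted in an int, which wraps silently.
        static int countWithInt(long actualSize, int chunk) {
            int total = 0;
            for (long remaining = actualSize; remaining > 0; remaining -= chunk) {
                total += (int) Math.min(chunk, remaining); // overflows past 2^31 - 1
            }
            return total;
        }

        // Assumed Java 8 style: bytes are counted in a long, so nothing wraps.
        static long countWithLong(long actualSize, int chunk) {
            long total = 0;
            for (long remaining = actualSize; remaining > 0; remaining -= chunk) {
                total += Math.min(chunk, remaining);
            }
            return total;
        }

        public static void main(String[] args) {
            long actual = 5650884111L;   // real uncompressed size from the error message
            int chunk = 1 << 20;         // pretend the data arrives in 1 MiB pieces
            long stored = storedSize(actual);

            System.out.println(stored);                                // 1355916815
            System.out.println(countWithInt(actual, chunk) == stored);  // true: mismatch hidden
            System.out.println(countWithLong(actual, chunk) == stored); // false: mismatch detected
        }
    }
    ```

    With the wrapped int counter the check passes by coincidence; with the long counter the comparison fails, which matches the "expected 1355916815 but got 5650884111 bytes" message.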

    Unless you need to support Java 6, (re)creating the file as valid ZIP64 file should solve the problem.

    (ZIP64 support was added in Java 7.)
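
If the original data is still available, one way to recreate the archive is to write it again with java.util.zip.ZipOutputStream on Java 7 or later, which should emit ZIP64 records on its own once an entry exceeds the 32-bit limits. The following is only a minimal sketch under that assumption: the class name, method name, and single-file layout are hypothetical, and the tool that produced the client's file is unknown.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper: re-packs one file into a fresh ZIP archive.
class ZipRecreateSketch {

    static void zipSingleFile(Path source, Path targetZip) throws IOException {
        try (OutputStream out = Files.newOutputStream(targetZip);
             ZipOutputStream zos = new ZipOutputStream(out);
             InputStream in = Files.newInputStream(source)) {

            // Sizes are not set by hand; the stream records them when the entry is closed.
            zos.putNextEntry(new ZipEntry(source.getFileName().toString()));

            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                zos.write(buffer, 0, n);
            }
            zos.closeEntry();
        }
    }
}
```

Calling `ZipRecreateSketch.zipSingleFile(Paths.get("data.bin"), Paths.get("data.zip"))` (both paths are placeholders) would produce an archive that the unzip method from the question can read on Java 7 and later. Java 6 would not be able to read an archive that actually needs ZIP64.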