java zlib inflate adler32

Is this a bug in Java's Inflater or what?


I was bitten by this in some unit tests.

I want to decompress some ZLIB-compressed data, using Inflater, where the raw data length is known in advance.

This straightforward version works as expected:

    /*  
     * Decompresses a zlib compressed buffer, with given size of raw data.
     * All data is fed and inflated in full (one step) 
     */
    public static byte[] decompressFull(byte[] comp, int len) throws Exception {
        byte[] res = new byte[len]; // result (uncompressed)
        Inflater inf = new Inflater();
        inf.setInput(comp);
        int n = inf.inflate(res, 0, len);
        if (n != len)
            throw new RuntimeException("didn't inflate all data");
        System.out.println("Data done (full). bytes in :"  + inf.getBytesRead() 
                + " out=" + inf.getBytesWritten()
                + " finished: " + inf.finished());
        // done - the next is not needed, just for checking... 
        //try a final inflate just in case (might trigger ZLIB crc check)
        byte[] buf2 = new byte[6];
        int nx = inf.inflate(buf2);//should give 0
        if (nx != 0)
            throw new RuntimeException("nx=" + nx + " " + Arrays.toString(buf2));
        if (!inf.finished())
            throw new RuntimeException("not finished?");
        inf.end();
        return res;
    }

Now, the compressed input can come in arbitrarily sized chunks. The following code emulates the case where the compressed input is fed in full except for the last 4 bytes, and then the remaining bytes are fed one at a time. (As I understand it, the last 4 or 5 bytes of the zlib stream are not needed to decompress the full data, but they are needed to check its integrity via the Adler-32 checksum.)
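
To make the trailer assumption concrete, here is a minimal sketch, assuming a zlib-wrapped stream as described in RFC 1950 (which is what a default `Deflater` produces), where the final 4 bytes hold the Adler-32 of the uncompressed data, most significant byte first. The class and helper names (`ZlibTrailerDemo`, `compress`, `trailer`, `adler32`) are made up for illustration and are not part of the question's code.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.zip.Adler32;
import java.util.zip.Deflater;

class ZlibTrailerDemo {

    /** Compresses raw into a zlib-wrapped stream (the Deflater default, nowrap = false). */
    static byte[] compress(byte[] raw) {
        Deflater def = new Deflater();
        def.setInput(raw);
        def.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        while (!def.finished()) {
            int n = def.deflate(buf);
            out.write(buf, 0, n);
        }
        def.end();
        return out.toByteArray();
    }

    /** Reads the last 4 bytes of the stream as an unsigned big-endian value. */
    static long trailer(byte[] comp) {
        return ByteBuffer.wrap(comp, comp.length - 4, 4).getInt() & 0xFFFFFFFFL;
    }

    /** Computes the Adler-32 checksum of the uncompressed data. */
    static long adler32(byte[] raw) {
        Adler32 a = new Adler32();
        a.update(raw, 0, raw.length);
        return a.getValue();
    }

    public static void main(String[] args) {
        byte[] raw = "hello hello hello zlib".getBytes(StandardCharsets.UTF_8);
        byte[] comp = compress(raw);
        System.out.println("trailer matches Adler-32: " + (trailer(comp) == adler32(raw)));
    }
}
```

If the comparison holds, the 4 trailing bytes carry no compressed payload, which is consistent with the data being fully inflated before they are fed.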

    public static byte[] decompressBytexByte(byte[] comp, int len) throws Exception {
        byte[] res = new byte[len]; // result (uncompressed)
        Inflater inf = new Inflater();
        inf.setInput(comp, 0, comp.length - 4);
        int n = inf.inflate(res, 0, len);
        if (n != len)
            throw new RuntimeException("didn't inflate all data");
        // inf.setInput(comp, comp.length - 4, 4);
        // !!! works if I uncomment the line before and comment out the next for loop
        for (int p = comp.length - 4; p < comp.length; p++)
            inf.setInput(comp, p, 1);
        System.out.println("Data done (decompressBytexByte). bytes in :" + inf.getBytesRead()
                + " out=" + inf.getBytesWritten() + " finished: " + inf.finished());
        // all data fed... try a final inflate (might -should?- trigger ZLIB crc check)
        byte[] buf2 = new byte[6];
        int nx = inf.inflate(buf2);//should give 0
        if (nx != 0)
            throw new RuntimeException("nx=" + nx + " " + Arrays.toString(buf2));
        if (!inf.finished())
            throw new RuntimeException("not finished?");
        inf.end();
        return res;
    }

Well, this doesn't work for me (Java 1.8.0_181). The inflater is not finished and the Adler-32 check does not seem to be done; moreover, it seems that the bytes are not fed into the inflater at all.

Even stranger: it works if the trailing 4 bytes are fed in one call.

You can try it here: https://repl.it/@HernanJJ/Inflater-Test

Even stranger things happen when I feed the whole input one byte at a time: sometimes the line int nx= inf.inflate(buf2);//should give 0 returns non-zero (when all data has already been inflated).

Is this expected behaviour? Am I missing something?


Solution

  • As @SeanBright already noticed, you are supposed to only feed it new input when Inflater.needsInput() returns true.

    A subsequent call to setInput replaces the previously passed input; it does not append to it. Any bytes that inflate has not consumed yet are dropped.

    Javadoc of Inflater.needsInput():

    Returns true if no data remains in the input buffer. This can be used to determine if #setInput should be called in order to provide more input.

    As long as you feed it one byte at a time and call inflate after each byte, that is always the case, so you can probably skip the check itself.

    You could replace the input-setting part of the decompressBytexByte method with this (for complete byte-by-byte feeding):

    byte[] res = new byte[len];
    Inflater inf = new Inflater();
    
    int a = 0; // number of bytes that have already been obtained
    for (int p = 0; p < comp.length; p++) {         
        inf.setInput(comp, p, 1);
        a += inf.inflate(res, a, len - a);
    }
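
For chunks larger than one byte, the same idea can be sketched as below: only call setInput when needsInput() is true, and call inflate between feeds. This is a minimal sketch, not a definitive implementation. The class `ChunkedInflateDemo`, its methods, and the `chunkSize` parameter are hypothetical names introduced for illustration, and the sketch assumes a zlib stream without a preset dictionary. Once the result buffer is full, it inflates into a small scratch buffer (like buf2 in the question) so the trailing checksum bytes can still be consumed.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

class ChunkedInflateDemo {

    /** Inflates comp into exactly len bytes, passing at most chunkSize bytes per setInput call. */
    static byte[] decompressChunked(byte[] comp, int len, int chunkSize) throws DataFormatException {
        if (chunkSize < 1)
            throw new IllegalArgumentException("chunkSize must be at least 1");
        byte[] res = new byte[len];
        byte[] spare = new byte[8]; // scratch output, used once res is full
        int got = 0; // uncompressed bytes produced so far
        int pos = 0; // compressed bytes handed to the inflater so far
        Inflater inf = new Inflater();
        try {
            while (!inf.finished()) {
                if (inf.needsDictionary())
                    throw new DataFormatException("preset dictionary not supported in this sketch");
                if (inf.needsInput()) { // previous chunk fully consumed, safe to replace it
                    if (pos == comp.length)
                        throw new DataFormatException("compressed stream is truncated");
                    int n = Math.min(chunkSize, comp.length - pos);
                    inf.setInput(comp, pos, n);
                    pos += n;
                }
                if (got < len) {
                    got += inf.inflate(res, got, len - got);
                } else if (inf.inflate(spare) != 0) {
                    throw new DataFormatException("more data than the expected " + len + " bytes");
                }
            }
            if (got != len)
                throw new DataFormatException("expected " + len + " bytes, got " + got);
            return res;
        } finally {
            inf.end();
        }
    }

    /** Calls setInput twice without inflating in between and reports what is still buffered. */
    static int remainingAfterTwoSetInputs(byte[] comp) {
        Inflater probe = new Inflater();
        probe.setInput(comp, 0, 1);
        probe.setInput(comp, 1, 1); // replaces the first byte instead of appending to it
        int remaining = probe.getRemaining();
        probe.end();
        return remaining;
    }

    /** Helper for the demo: produces a zlib-wrapped stream with a default Deflater. */
    static byte[] compress(byte[] raw) {
        Deflater def = new Deflater();
        def.setInput(raw);
        def.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        while (!def.finished())
            out.write(buf, 0, def.deflate(buf));
        def.end();
        return out.toByteArray();
    }

    public static void main(String[] args) throws DataFormatException {
        byte[] raw = "inflate me, one chunk at a time".getBytes(StandardCharsets.UTF_8);
        byte[] comp = compress(raw);
        for (int chunk : new int[] {1, 3, comp.length})
            System.out.println("chunk=" + chunk + " ok="
                    + Arrays.equals(raw, decompressChunked(comp, raw.length, chunk)));
        System.out.println("buffered after two setInput calls: " + remainingAfterTwoSetInputs(comp));
    }
}
```

The loop ends only when finished() is true, so the trailing checksum bytes have been consumed before the result is returned. The remainingAfterTwoSetInputs probe mirrors the for loop in the question: with no inflate call between the setInput calls, only the most recently set byte is still buffered.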