I'm seeing strange behavior on Java 8u45 from the java.util.zip.Deflater.deflate(byte[] b, int off, int len, int flush)
method when it is used with small output buffers.
(I'm working on some low level networking code related to WebSocket's upcoming permessage-deflate
extension, so small buffers are a reality for me)
The example code:
    package deflate;

    import java.nio.charset.StandardCharsets;
    import java.util.zip.Deflater;

    public class DeflaterSmallBufferBug
    {
        public static void main(String[] args)
        {
            boolean nowrap = true;
            Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION,nowrap);
            byte[] input = "Hello".getBytes(StandardCharsets.UTF_8);
            System.out.printf("input is %,d bytes - %s%n",input.length,getHex(input,0,input.length));
            // hand the input to the deflater before asking for output
            deflater.setInput(input);
            byte[] output = new byte[input.length];
            // break out of infinite loop seen with bug
            int maxloops = 10;
            // Compress the data
            while (maxloops-- > 0)
            {
                int compressed = deflater.deflate(output,0,output.length,Deflater.SYNC_FLUSH);
                System.out.printf("compressed %,d bytes - %s%n",compressed,getHex(output,0,compressed));
                // a partially filled buffer means the deflater has nothing more to give
                if (compressed < output.length)
                {
                    System.out.printf("Compress success");
                    return;
                }
            }
            System.out.printf("Exited compress (maxloops left %d)%n",maxloops);
        }

        // formats buf[offset..offset+len) as "[48 65 6C]"
        private static String getHex(byte[] buf, int offset, int len)
        {
            StringBuilder hex = new StringBuilder();
            hex.append('[');
            for (int i = offset; i < (offset + len); i++)
            {
                if (i > offset)
                {
                    hex.append(' ');
                }
                hex.append(String.format("%02X",buf[i]));
            }
            hex.append(']');
            return hex.toString();
        }
    }
In the above case, I'm attempting to generate compressed bytes for the input "Hello"
using an output buffer of 5 bytes in length.
I would expect the following resulting bytes:
buffer 1 [ F2 48 CD C9 C9 ]
buffer 2 [ 07 00 00 00 FF ]
buffer 3 [ FF ]
Which translates as
[ F2 48 CD C9 C9 07 00 ] <-- the compressed data
[ 00 00 FF FF ] <-- the deflate tail bytes
However, when Deflater.deflate() is used with a small buffer, this normal loop never terminates: every call keeps returning 5 bytes of compressed data. (This seems to manifest only with buffers of 5 bytes or fewer.)
Resulting output of running the above demo ...
input is 5 bytes - [48 65 6C 6C 6F]
compressed 5 bytes - [F2 48 CD C9 C9]
compressed 5 bytes - [07 00 00 00 FF]
compressed 5 bytes - [FF 00 00 00 FF]
compressed 5 bytes - [FF 00 00 00 FF]
compressed 5 bytes - [FF 00 00 00 FF]
compressed 5 bytes - [FF 00 00 00 FF]
compressed 5 bytes - [FF 00 00 00 FF]
compressed 5 bytes - [FF 00 00 00 FF]
compressed 5 bytes - [FF 00 00 00 FF]
compressed 5 bytes - [FF 00 00 00 FF]
Exited compress (maxloops left -1)
If you make the input/output larger than 5 bytes, the problem seems to go away. (Just make the input string "Hellox" to test this for yourself.)
Results of making the buffer 6 bytes (input as "Hellox"):
input is 6 bytes - [48 65 6C 6C 6F 78]
compressed 6 bytes - [F2 48 CD C9 C9 AF]
compressed 6 bytes - [00 00 00 00 FF FF]
compressed 5 bytes - [00 00 00 FF FF]
Compress success
Even these results are a bit quirky to me, as there seem to be 2 deflate tail-byte sequences present.
So, I guess my ultimate question is: am I missing something about Deflater usage that is making things odd for me, or is this pointing at a possible bug in the JVM Deflater implementation itself?
Update: Aug 7, 2015
This discovery has been accepted as bugs.java.com/JDK-8133170
This is a zlib "feature", documented in zlib.h:
In the case of a Z_FULL_FLUSH or Z_SYNC_FLUSH, make sure that avail_out is greater than six to avoid repeated flush markers due to avail_out == 0 on return.
What is happening is that each call of deflate() is inserting a five-byte flush marker. Since you are not providing enough output space to get the whole marker, you call again to get more output, but are asking it to insert another flush marker at the same time.
What you should be doing is calling deflate() with Z_SYNC_FLUSH once, and then getting all of the available output with additional deflate() calls, if necessary, that use Z_NO_FLUSH (NO_FLUSH in Java).
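As a rough illustration of the "flush once" idea, here is a minimal sketch of a variant that sidesteps the draining calls altogether: it performs the single SYNC_FLUSH call into a scratch buffer with plenty of headroom (following the zlib.h advice quoted above about leaving more than six bytes of output space), and only afterwards cuts the result into small chunks. The names SyncFlushOnce, compressWithSyncFlush and split, the 64 bytes of headroom, and the 5-byte chunk size are all assumptions made for this example, and the headroom is only reasonable for small messages.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.zip.Deflater;

class SyncFlushOnce
{
    /** Compresses input with exactly one SYNC_FLUSH call into a roomy scratch buffer. */
    public static byte[] compressWithSyncFlush(byte[] input)
    {
        Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION, true); // nowrap, as in the question
        try
        {
            deflater.setInput(input);
            // Assumption: 64 spare bytes is enough headroom for a small message plus the flush marker.
            byte[] scratch = new byte[input.length + 64];
            int count = deflater.deflate(scratch, 0, scratch.length, Deflater.SYNC_FLUSH);
            if (count == scratch.length)
            {
                // A completely filled buffer means the flush may be incomplete; this sketch gives up here.
                throw new IllegalStateException("scratch buffer too small for a complete sync flush");
            }
            return Arrays.copyOf(scratch, count);
        }
        finally
        {
            deflater.end(); // release the native zlib state
        }
    }

    /** Cuts already-compressed data into chunks of at most chunkSize bytes. */
    public static List<byte[]> split(byte[] data, int chunkSize)
    {
        List<byte[]> chunks = new ArrayList<>();
        for (int off = 0; off < data.length; off += chunkSize)
        {
            chunks.add(Arrays.copyOfRange(data, off, Math.min(off + chunkSize, data.length)));
        }
        return chunks;
    }

    public static void main(String[] args)
    {
        byte[] compressed = compressWithSyncFlush("Hello".getBytes(StandardCharsets.UTF_8));
        for (byte[] chunk : split(compressed, 5))
        {
            StringBuilder hex = new StringBuilder();
            for (byte b : chunk)
            {
                hex.append(String.format("%02X ", b));
            }
            System.out.printf("chunk of %d bytes - %s%n", chunk.length, hex.toString().trim());
        }
    }
}
```

Because the deflater only ever sees one flush request, it has no opportunity to queue a second marker; the small buffers become a framing concern rather than something the compressor is involved in. The chunks printed by main can be compared against the three buffers expected in the question.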