java sockets gzipinputstream gzipoutputstream

GZIPInputStream unable to decode at receiver side (invalid code lengths set)


I'm attempting to encode a String in a client using GZIPOutputStream, then decode the String in a server using GZIPInputStream.

The client's side code (after the initial socket connection establishment) is:

// ... Establishing connection, getting a socket object.
// ... Now proceeding to send data using that socket:

DataOutputStream out = new DataOutputStream(socket.getOutputStream());
String message = "Hello World!";

ByteArrayOutputStream baos = new ByteArrayOutputStream();
GZIPOutputStream gzip = new GZIPOutputStream(baos);
gzip.write(message.getBytes());
gzip.close();
String encMessage = baos.toString();

out.writeInt(encMessage.getBytes().length);
out.write(encMessage.getBytes());
out.flush();

And the server's side code (again, after establishing a connection):

DataInputStream input = new DataInputStream(socket.getInputStream());

int length = input.readInt();
byte[] buffer = new byte[length];
input.readFully(buffer);

GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(buffer));
BufferedReader r = new BufferedReader(new InputStreamReader(gz));
String s = "";
String line;
while ((line = r.readLine()) != null) 
{
    s += line;
}

I checked and the buffer length (i.e., the coded message's size) is passed correctly, so the right number of bytes is transferred. However, I'm getting this:

java.util.zip.ZipException: invalid code lengths set
at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:164)
at java.util.zip.GZIPInputStream.read(GZIPInputStream.java:117)
at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:122)
at parsing.ReceiveResponsesTest$TestReceiver.run(ReceiveResponsesTest.java:147)
at java.lang.Thread.run(Thread.java:745)

Any ideas?

Thanks in advance for any assistance!


Solution

  • You're calling toString() on the ByteArrayOutputStream. That is incorrect: it opens up all kinds of character encoding problems, which are probably what is biting you here. You need to call toByteArray() instead:

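    // baos is the ByteArrayOutputStream that the GZIPOutputStream wrote into;
    // out is the DataOutputStream wrapping the socket.
    byte[] encMessage = baos.toByteArray();

    out.writeInt(encMessage.length);
    out.write(encMessage);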
    

    Detail:

    If you use toString(), Java decodes your bytes into characters using the platform default character encoding. That could be some Windows code page, UTF-8, or something else. However, compressed data is arbitrary binary, and not every byte sequence is valid in a given encoding; invalid sequences are replaced by a substitute character, such as a question mark or U+FFFD. Without knowing which encoding your platform uses, it's hard to tell exactly what was changed.

    But in any case, decoding the byte array to a String and then encoding it back to a byte array when you write it out is very likely to change the data. And there is no need to do it: you can get the byte array directly, as shown in the code above.
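
To see the effect in isolation, here is a minimal sketch without any sockets. The class name GzipRoundTrip and its helper methods are made up for illustration and are not from the original post. It also assumes UTF-8 as an explicit charset so the result is the same on every machine; the code in the question uses the platform default, which may behave differently (ISO-8859-1, for instance, happens to map every byte value and would hide the problem).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class; names are illustrative, not from the original post.
class GzipRoundTrip {

    // Compress text and return the raw gzip bytes (never a String).
    static byte[] compress(String text) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(baos)) {
            gzip.write(text.getBytes(StandardCharsets.UTF_8));
        }
        // The gzip stream is closed at this point, so the trailer has been written.
        return baos.toByteArray();
    }

    // Decompress raw gzip bytes back into text.
    static String decompress(byte[] compressed) throws IOException {
        ByteArrayOutputStream plain = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = gz.read(chunk)) != -1) {
                plain.write(chunk, 0, n);
            }
        }
        return new String(plain.toByteArray(), StandardCharsets.UTF_8);
    }

    // Mimics the bug: bytes -> String -> bytes (UTF-8 assumed for determinism).
    static byte[] throughString(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        byte[] packed = compress("Hello World!");
        System.out.println(decompress(packed)); // prints "Hello World!"

        // The second gzip magic byte (0x8b) is not valid UTF-8 on its own,
        // so it is replaced during decoding and the bytes no longer match.
        byte[] mangled = throughString(packed);
        System.out.println(Arrays.equals(packed, mangled)); // prints "false"
    }
}
```

In the client from the question, the equivalent change is to send the result of toByteArray() with writeInt and write, so the server's readFully receives exactly the bytes that GZIPOutputStream produced.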