c# c++ thrift zlib

How does Thrift handle Zlib flush markers being split over multiple messages?


I have an application with a C++ server and a C# client that communicate using Apache Thrift. The server uses TZlibTransport.cpp for zlib compression, and the client uses a wrapper around Ionic.Zlib to decompress the data. This works most of the time.

I noticed that in very specific situations the client would crash with one of the following errors:

Thrift.Protocol.TProtocolException: Missing version in readMessageBegin, old client?
   at Thrift.Protocol.TBinaryProtocol.ReadMessageBegin()

Ionic.Zlib.ZlibException: Bad state (invalid block type)
   at Ionic.Zlib.InflateManager.Inflate(FlushType flush)

I found that in every case where these errors occurred, the server had sent two packets: one just over 1024 bytes (the size of the compressed write buffer that TZlibTransport.cpp uses) and one of 5-8 bytes. Looking at the data in the second packet, I noticed that it contained the flush marker that zlib uses, added twice,

ff ff 00 00 00 ff ff

with the first part of the first marker at the end of the previous packet. If I increase the buffer size slightly, so that there is enough space to write the marker in one packet, the crash does not occur, so I believe it is this doubled marker that causes the problem. Changing the buffer size is not a real solution, however, because the error would then simply occur somewhere else in the application.

I have looked into zlib and found that this is expected behaviour when it is not given enough space in the output buffer (https://github.com/madler/zlib/issues/149). However, I have not been able to find anybody who has run into this as a problem with Thrift.
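To make the buffer-size effect concrete, here is a toy model in C++. It is only a sketch: `ToyCompressor` and `flush_all` are hypothetical names invented for illustration, and they stand in for a compressor that still has pending output; this is neither zlib's API nor Thrift's code. It shows how a fixed-size output buffer can split a trailing marker across two packets, and that the writer has to keep draining until nothing is pending.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a compressor holding output that has not been
// handed to the caller yet. This is NOT zlib's API.
struct ToyCompressor {
    std::vector<std::uint8_t> pending;  // bytes produced but not yet drained
    std::size_t pos = 0;                // index of the next byte to hand out

    // Copy as many pending bytes as fit into out[0..cap); return the count.
    std::size_t drain(std::uint8_t* out, std::size_t cap) {
        std::size_t n = std::min(cap, pending.size() - pos);
        std::copy(pending.begin() + pos, pending.begin() + pos + n, out);
        pos += n;
        return n;
    }

    bool has_pending() const { return pos < pending.size(); }
};

// Push all pending output through a fixed-size buffer, producing one packet
// per pass. The loop must continue until nothing is pending; stopping after
// the first pass would leave the tail of the marker unsent.
std::vector<std::vector<std::uint8_t>> flush_all(ToyCompressor& c,
                                                 std::size_t buf_size) {
    std::vector<std::vector<std::uint8_t>> packets;
    std::vector<std::uint8_t> buf(buf_size);
    while (c.has_pending()) {
        std::size_t n = c.drain(buf.data(), buf.size());
        packets.emplace_back(buf.begin(), buf.begin() + n);
    }
    return packets;
}
```

For example, with pending bytes `aa bb 00 00 ff ff` and a 4-byte buffer, `flush_all` yields the packets `aa bb 00 00` and `ff ff`: the marker's tail lands in a second packet that begins with `ff`.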

My question is therefore whether Thrift is expected to split the marker over multiple packets for specific data lengths, and how the client is supposed to handle this.


Solution

  • It looks like the problem is not that the marker was emitted twice, but rather simply that the first marker didn't entirely fit in the buffer. Had the output been just ff ff, you would have exactly the same problem and the same error message. ff cannot start a deflate stream, because it gives an invalid block type (3).

    From your description, it sounds like there is a bug in Thrift: it does not ensure, or at least check, that all of the compressed data actually fits in the buffer.
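The "invalid block type (3)" point can be checked by hand. A deflate block header is three bits, packed starting from the least significant bit of the byte: one BFINAL bit followed by a two-bit BTYPE, where the value 3 is reserved. The C++ sketch below decodes those bits; the names `BlockHeader`, `decode_block_header`, and `is_valid_block_type` are made up for illustration, and it assumes the decoder is at a byte boundary and expecting a block header.

```cpp
#include <cstdint>

// Hypothetical helper types for illustration; not part of zlib or Thrift.
struct BlockHeader {
    bool final_block;     // BFINAL: bit 0 of the byte
    unsigned block_type;  // BTYPE: bits 1-2 (0 stored, 1 fixed Huffman,
                          //        2 dynamic Huffman, 3 reserved/invalid)
};

// Decode the 3-bit deflate block header from the first byte of a block,
// assuming the block starts on a byte boundary.
BlockHeader decode_block_header(std::uint8_t first_byte) {
    return BlockHeader{(first_byte & 0x01) != 0, (first_byte >> 1) & 0x03u};
}

// Block type 3 is reserved, so a decoder has to reject it.
bool is_valid_block_type(unsigned block_type) {
    return block_type != 3;
}
```

Calling `decode_block_header(0xff)` gives BFINAL = 1 and BTYPE = 3, which is the reserved type an inflater rejects. A byte of `00` decodes to BTYPE = 0, a stored block, which is how the empty block behind the flush marker begins.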