c++ compression zlib

How to work correctly with zlib::inflate_stream::write?


I'm trying to understand the finer details of working with zlib::inflate_stream::write. The code is more or less like this:

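// "zlib" is assumed here to be an alias for boost::beast::zlib.
zlib::z_params params;
zlib::inflate_stream inflate;

params.data_type = zlib::binary;
params.avail_in  = inBufferSize;   // bytes available at next_in
params.next_in   = inBuffer;
params.avail_out = outBufferSize;  // free space available at next_out
params.next_out  = outBuffer;

boost::beast::error_code boostErrorCode;
// Pass the same z_params object that was filled in above.
inflate.write(params, zlib::Flush::finish, boostErrorCode);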

It isn't my code, so I would prefer not to change zlib::Flush::finish, although the input often comes in batches.

This is what I understand:
Before the call:
The parameters are as expected: avail_in is the number of bytes in the input buffer, which next_in points to. avail_out is the size of the output buffer, which next_out points to.
After the call:

  • avail_in is the number of bytes left in the input buffer that weren't used.
  • next_in: all the bytes before this position were used. This byte and the following ones weren't.
  • avail_out is the amount of free space left in the output buffer, so the number of bytes written is the original avail_out minus this value.
  • next_out: output was written up to, but not including, this position.

My questions:

  • The compressed data is stored in bits and not bytes. Can I be sure that all the bits from the bytes used in the input buffer were decompressed? Can I be sure that none of the bits in the byte pointed to by next_out were used?
  • What if the input buffer contains compressed data followed by uncompressed data? Will zlib recognize the uncompressed data and stop before it, or will it produce an error?
  • If the input buffer contains two blocks of data, each compressed separately, will zlib be able to tell them apart and stop before the second one? Or will it decompress both?
  • Assuming the input comes in chunks, how do I know whether I finished decompressing the data?

I would also appreciate a discussion of how the answers change with different values of the Flush parameter.

Thanks


Solution

    • The compressed data is stored in bits and not bytes. Can I be sure that all the bits from the bytes used in the input buffer were decompressed? Can I be sure that none of the bits in the byte pointed to by next_out were used?

    All of the bits that could be interpreted and consumed from the input bytes were interpreted and consumed. There may be some remaining bits in the last byte or two that were not consumed, because they were only a partial code, and more bits are needed from the next byte or two to decode it.

    next_out (and avail_out) refer to complete bytes. The output of decompression is bytes.

    • What if the input buffer contains compressed data followed by uncompressed data? Will zlib recognize the uncompressed data and stop before it, or will it produce an error?

    Deflate streams are self-terminating. inflate will know when it ends and will stop there.

    • If the input buffer contains two blocks of data, each compressed separately, will zlib be able to tell them apart and stop before the second one? Or will it decompress both?

    It will stop at the end of the first one. You will need to start inflate again, with a reset or a new inflate_stream, to decompress the second one.

    • Assuming the input comes in chunks, how do I know whether I finished decompressing the data?

    When it returns end_of_stream, that is, when write sets the error code to zlib::error::end_of_stream.