I am currently trying to use zlib for compression in one of my projects. I had a look at the basic zlib tutorial and I am confused by the following statements:
CHUNK is simply the buffer size for feeding data to and pulling data from the zlib routines. Larger buffer sizes would be more efficient, especially for inflate(). If the memory is available, buffer sizes on the order of 128K or 256K bytes should be used.
#define CHUNK 16384
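For reference, CHUNK there is the size of the two stack buffers in the tutorial's read/inflate/write loop. A condensed sketch of that loop (modeled on zlib's zpipe.c example, with most error handling trimmed) looks like this:

#include <stdio.h>
#include <string.h>
#include "zlib.h"

#define CHUNK 16384

/* Decompress from source to dest using CHUNK-sized buffers
   (condensed from zpipe.c; most error handling omitted). */
static int inf(FILE *source, FILE *dest)
{
    unsigned char in[CHUNK], out[CHUNK];
    z_stream strm;
    int ret = Z_OK;

    memset(&strm, 0, sizeof(strm));   /* zalloc/zfree/opaque left at NULL */
    if (inflateInit(&strm) != Z_OK)
        return Z_MEM_ERROR;
    do {
        /* fill the input buffer with up to CHUNK compressed bytes */
        strm.avail_in = fread(in, 1, CHUNK, source);
        if (strm.avail_in == 0)
            break;
        strm.next_in = in;
        do {
            /* drain decompressed output CHUNK bytes at a time */
            strm.avail_out = CHUNK;
            strm.next_out = out;
            ret = inflate(&strm, Z_NO_FLUSH);
            if (ret != Z_OK && ret != Z_STREAM_END) {
                inflateEnd(&strm);
                return ret;
            }
            fwrite(out, 1, CHUNK - strm.avail_out, dest);
        } while (strm.avail_out == 0);
    } while (ret != Z_STREAM_END);
    inflateEnd(&strm);
    return ret == Z_STREAM_END ? Z_OK : Z_DATA_ERROR;
}

As far as I can tell, the API itself does not require any particular size; avail_in and avail_out simply describe whatever buffers you hand it.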
In my case I will always have a small buffer already available at the output end (around 80 bytes) and will continually feed very small amounts of data (a few bytes at a time) from the input side through zlib. This means I will not have a large buffer on either side; instead I am planning on using much smaller ones.
However, I am not sure how to interpret "larger buffer sizes would be more efficient". Is this referring to the efficiency of the encoding, or to time/space efficiency?
One idea I have to remedy this situation would be to add more layers of buffering that accumulate data from the input and flush to the output repeatedly. However, this would mean accumulating data and adding more levels of copying, which would also hurt performance.
Now, if "efficient" just refers to time/space efficiency, I could simply measure the impact of both methods and pick one. However, if the actual encoding could be affected by the smaller buffer sizes, that might be really hard to detect.
Does anyone have experience using zlib with very small buffers?
It means time efficiency. If you give inflate() large input and output buffers, it will use faster inflation code internally. It will work just fine with buffers as small as you like (even size 1), but it will be slower.
It is probably worthwhile for you to accumulate input and feed it to inflate() in larger chunks, as in the sketch below. You would also need to provide larger output buffers.
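For example, here is a minimal sketch of such an accumulation layer. The struct and function names (acc_inflater, acc_push, acc_run, acc_finish, sink) are made up for illustration, not part of zlib, and error handling is minimal:

#include <string.h>
#include "zlib.h"

#define ACC 16384   /* accumulate this much input before calling inflate() */

/* Hypothetical wrapper: collects tiny inputs, inflates in ACC-sized chunks.
   Caller must inflateInit(&a->strm) and set a->sink before use. */
struct acc_inflater {
    z_stream strm;
    unsigned char in[ACC];
    size_t fill;                                  /* bytes accumulated so far */
    void (*sink)(const unsigned char *, size_t);  /* receives decompressed output */
};

/* Run inflate() over whatever has been accumulated, then reset the buffer. */
static int acc_run(struct acc_inflater *a, int flush)
{
    unsigned char out[ACC];
    int ret;

    a->strm.next_in = a->in;
    a->strm.avail_in = a->fill;
    do {
        a->strm.next_out = out;
        a->strm.avail_out = ACC;
        ret = inflate(&a->strm, flush);
        if (ret != Z_OK && ret != Z_STREAM_END && ret != Z_BUF_ERROR)
            return ret;
        a->sink(out, ACC - a->strm.avail_out);
    } while (a->strm.avail_out == 0);
    a->fill = 0;
    return ret;
}

/* Feed a few bytes at a time; inflate() only runs once the buffer is full. */
static int acc_push(struct acc_inflater *a, const unsigned char *p, size_t n)
{
    while (n > 0) {
        size_t take = ACC - a->fill;
        if (take > n)
            take = n;
        memcpy(a->in + a->fill, p, take);   /* the extra copy you mentioned */
        a->fill += take;
        p += take;
        n -= take;
        if (a->fill == ACC) {               /* buffer full: inflate it */
            int ret = acc_run(a, Z_NO_FLUSH);
            if (ret != Z_OK && ret != Z_BUF_ERROR && ret != Z_STREAM_END)
                return ret;
        }
    }
    return Z_OK;
}

/* At end of input, process whatever remains. */
static int acc_finish(struct acc_inflater *a)
{
    return acc_run(a, Z_FINISH);
}

The extra memcpy per input byte is cheap compared to the per-call overhead of running inflate() on just a few bytes at a time. And since only speed is at stake, not the encoding, you can safely benchmark both approaches against each other.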