I'm currently working on porting the LZW compression and decompression methods from the FFmpeg source code into my project. What I stumbled upon is that the output buffer (where the compressed data will be stored) needs to be bigger than the input buffer that we want to compress. Isn't that contradictory to compression itself?
The following check is located in the ff_lzw_encode() function, which is part of the lzwenc.c source file:
if (insize * 3 > (s->bufsize - s->output_bytes) * 2)
{
    printf("Size of output buffer is too small!\n");
    return -1;
}
For my particular example, I'm trying to compress raw video frames before sending them locally. But if I allocate a buffer of size (insize * 3) / 2 (where the compressed data will be stored), wouldn't that take more time to send using the send() function than sending the raw buffer, which is of size insize?
You cannot guarantee that the 'compressed' form is smaller than, or even the same size as, the input. Think about the worst case of purely random data, which cannot be compressed in any way and will, at best, 'compress' to 100% of its original size; on top of that, some compression metadata or escape sequences need to be added, resulting in e.g. 100% + 5 bytes.
In fact, 'compressing' incompressible data to "only" 100% of its original size usually does not happen automatically. If the algorithm just tries to compress the input normally, the result may even be significantly larger than the input. Smart compression tools detect this situation and fall back to sending that chunk of data uncompressed instead, adding some metadata to at least indicate that the chunk is uncompressed.
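To make the fallback idea concrete, here is a minimal sketch. It does not use LZW or any FFmpeg code: `toy_rle_encode`, `pack_chunk` and the one-byte chunk tag are hypothetical names invented for illustration, and a toy run-length encoder stands in for a real compressor because it is easy to see when it expands the data.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical one-byte chunk tag: tells the receiver how to read the payload. */
enum { CHUNK_STORED = 0, CHUNK_RLE = 1 };

/* Toy run-length encoder (stand-in for a real compressor).
 * Emits (run length, byte) pairs, so data without repeats doubles in size.
 * Returns the number of bytes written, or 0 if the result does not fit in cap. */
static size_t toy_rle_encode(const uint8_t *in, size_t n, uint8_t *out, size_t cap)
{
    size_t o = 0;
    for (size_t i = 0; i < n; ) {
        size_t run = 1;
        while (i + run < n && in[i + run] == in[i] && run < 255)
            run++;
        if (o + 2 > cap)
            return 0;               /* would not fit: give up */
        out[o++] = (uint8_t)run;
        out[o++] = in[i];
        i += run;
    }
    return o;
}

/* Packs one chunk as: tag byte + payload.
 * out must hold at least n + 1 bytes (the worst case: tag + raw copy).
 * Returns the total number of bytes written to out. */
static size_t pack_chunk(const uint8_t *in, size_t n, uint8_t *out)
{
    /* Only accept the encoded form if it is strictly smaller than the input. */
    size_t packed = (n > 1) ? toy_rle_encode(in, n, out + 1, n - 1) : 0;
    if (packed > 0) {
        out[0] = CHUNK_RLE;
        return 1 + packed;
    }
    /* Fallback: store the chunk uncompressed and say so in the tag. */
    out[0] = CHUNK_STORED;
    if (n > 0)
        memcpy(out + 1, in, n);
    return 1 + n;
}
```

With this scheme the worst case is the input size plus one tag byte, which is exactly the "100% + a few bytes" situation described above.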
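The buffer you have allocated must be large enough to contain the worst-case number of 'compressed' bytes, hence the need for some 'headroom'. In the check you quoted, that headroom is 50%: the free space in the output buffer must be at least 1.5 times insize.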
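As a small worked example, the smallest buffer size that passes the quoted check can be computed as follows. This is only a sketch: `lzw_min_outbuf_size` is a hypothetical helper, not an FFmpeg function, and it assumes a fresh encoder state where `output_bytes` is 0.

```c
#include <stddef.h>
#include <stdint.h>

/* Smallest bufsize for which the quoted check
 *     insize * 3 > bufsize * 2
 * is false, assuming output_bytes == 0 (hypothetical helper).
 * Returns 0 if the computation would overflow size_t. */
static size_t lzw_min_outbuf_size(size_t insize)
{
    if (insize > (SIZE_MAX - 1) / 3)
        return 0;
    return (insize * 3 + 1) / 2;    /* insize * 1.5, rounded up */
}
```

Note the rounding: with plain integer division, (insize * 3) / 2 rounds down, so for an odd insize such as 3 it yields 4, and the check 9 > 8 still rejects the buffer. Rounding up gives 5, which passes.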
"wouldn't that take more time to send using send() function than sending raw buffer"
Yes, it would. That's why you don't send the whole (allocated) buffer but only as many bytes from that buffer as the compression function indicates it has used.
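A minimal sketch of that last point, assuming a POSIX socket: `send_all` and `send_compressed_frame` are hypothetical helpers, and `used` stands for whatever byte count your compression call reports as written.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Sends exactly len bytes, looping because send() may accept fewer.
 * Returns 0 on success, -1 on error. */
static int send_all(int fd, const uint8_t *buf, size_t len)
{
    size_t sent = 0;
    while (sent < len) {
        ssize_t n = send(fd, buf + sent, len - sent, 0);
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: retry */
            return -1;
        }
        sent += (size_t)n;
    }
    return 0;
}

/* outbuf was allocated with capacity bufsize (including the headroom),
 * but only the first `used` bytes hold compressed data, so only those
 * are sent. */
static int send_compressed_frame(int fd, const uint8_t *outbuf,
                                 size_t bufsize, size_t used)
{
    if (used > bufsize)
        return -1;                  /* inconsistent sizes: refuse to send */
    return send_all(fd, outbuf, used);
}
```

The allocation size only affects memory use on the sending side; the number of bytes on the wire is `used`, which for compressible frames should be well below insize.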