
Would a compressor work if there are no equal items?


I am starting to learn about compressors, and the basic idea behind generic compressors is to store similar items in a dictionary so that the whole thing takes up less space. An example with words would be:

"I am in stack overflow. I am in stack overflow. I am in stack overflow. I am in stack overflow. Hello. I am in stack overflow. I am in stack overflow. I am in stack overflow. I am in stack overflow. Bye."

So in the dictionary we'd have:

A: "I am in stack overflow."

and the compressed text would be:

AAAAHello.AAAABye.
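
The substitution described above can be sketched in a few lines of Python. This is only a toy illustration, not how zip or rar actually work: the function names `dictionary_encode` and `dictionary_decode` are made up for this example, the spaces between sentences are dropped to match the encoded form shown, and the sketch assumes the token "A" never appears in the original text.

```python
def dictionary_encode(text, dictionary):
    """Replace every dictionary phrase in the text with its short token."""
    for token, phrase in dictionary.items():
        text = text.replace(phrase, token)
    return text


def dictionary_decode(encoded, dictionary):
    """Expand every token back into its phrase.

    Assumption: no token occurs in the original text, otherwise
    this naive decoder would expand it by mistake.
    """
    for token, phrase in dictionary.items():
        encoded = encoded.replace(token, phrase)
    return encoded


phrase = "I am in stack overflow."
# Sentences are concatenated without separating spaces to match the example.
message = phrase * 4 + "Hello." + phrase * 4 + "Bye."
dictionary = {"A": phrase}

encoded = dictionary_encode(message, dictionary)
decoded = dictionary_decode(encoded, dictionary)

print(encoded)
print(len(message), "->", len(encoded), "characters (plus the dictionary itself)")
```

The saving only counts if the dictionary is stored or transmitted along with the encoded text, so the real gain is somewhat smaller than the character counts suggest.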

Would a compressor reduce size if there are no similar items? Or is it even possible for there to not be similar items?


Solution

  • Yes, text can be losslessly compressed even if there are no repeating strings, so long as the symbols appear with uneven frequency. For example, if a message uses only 36 of the 256 possible byte values, each symbol needs only log2(36) ≈ 5.17 bits instead of 8, so the message can be compressed to about 65% of its original size.

    Yes, of course it's possible to have no repeating strings.
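
The 65% figure can be checked with a rough sketch. The helper names `fixed_width_ratio` and `entropy_bits_per_symbol` are made up for this illustration, and the calculation ignores the cost of storing the symbol table, so treat the numbers as a lower bound rather than what a real compressor would achieve.

```python
import math
import string
from collections import Counter


def fixed_width_ratio(alphabet_size, original_bits=8):
    """Compressed/original size ratio if every symbol is stored in
    log2(alphabet_size) bits instead of original_bits."""
    return math.log2(alphabet_size) / original_bits


def entropy_bits_per_symbol(data):
    """Shannon entropy of the symbol frequencies, in bits per symbol."""
    counts = Counter(data)
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# A message in which no symbol (and so no string) ever repeats:
# 26 lowercase letters followed by 10 digits, 36 distinct bytes in total.
message = (string.ascii_lowercase + string.digits).encode("ascii")

ratio = fixed_width_ratio(len(set(message)))
bits = entropy_bits_per_symbol(message)

print(f"{bits:.2f} bits per symbol instead of 8")
print(f"compressed size is about {ratio:.1%} of the original")
```

Here the message contains nothing a dictionary could exploit, yet it still shrinks because 220 of the 256 byte values never occur. If some of the 36 symbols were more frequent than others, the entropy would drop further and an entropy coder such as Huffman coding could do better than the fixed-width bound.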