Search code examples
compressiontheoryinformation-theory

Theory: Compression algorithm that makes some files smaller but none bigger?


I came across this question;

"A lossless compression algorithm claims to guarantee to make some files smaller and no files larger.
Is this;

a) Impossible

b) Possible but may run for an indeterminate amount of time,

c) Possible for compression factor 2 or less,

d) Possible for any compression factor?"

I'm leaning towards (a), but couldn't give a solid explanation as to why. (I'll list the thoughts a friend and I came up with as a possible answer)


Solution

  • By the pigeon-hole principle, given a string of 10 bits you have 1024 possible inputs, and need to map to 9 bits or fewer, so there are < 1024 outputs.

    This guarantees either the algorithm has collisions (lossy compression) or at some point choses to return the unmodified input as output.

    In the latter case, you cannot determine how to decompress an arbitrary string of bits. (It could be an unmodified input, or a compressed output from a larger bit string).

    -> Impossible.