python python-2.7 pycrypto

Compressed encrypted file is bigger than the source


I created an encrypted file from a text file in Python with beefish. beefish uses PyCrypto.

My source text file is 33742 bytes and the encrypted version is 33752 bytes. That's OK so far, but ...

when I compress test.enc (the encrypted test file) with tar -czvf, the final file is 33989 bytes. Why does compression not work when the source file is encrypted?

So far, the only option seems to be to compress the file first and then encrypt it, because then the file stays small.


Solution

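  • Compression works by identifying patterns (redundancy) in the data and encoding them more compactly. The output of a good cipher is designed to have no identifiable patterns (that's the whole point), so a compressor has nothing to work with and you can't compress it.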

    For a perfect encryption algorithm that produced a 33,742-byte output, ideally all you would be able to determine about the decrypted original data is that it fits in 33,742 bytes, and nothing more. If you could compress the output to, say, 31,400 bytes, then you would immediately know the input data was not, say, 32,000 bytes of random data, since random data is patternless and thus incompressible. That would indicate a failure on the part of the encryption scheme: it's nobody's business whether the decrypted data is random or not.
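
The effect described above can be reproduced with a minimal sketch that uses only the standard library. It does not call beefish or PyCrypto. Instead it assumes that good ciphertext is indistinguishable from random bytes and uses os.urandom as a stand-in for the encrypted file. The sample text and the helper name compressed_size are made up for illustration; the sizes are borrowed from the question.

```python
import os
import zlib


def compressed_size(data):
    """Return the size in bytes of data after zlib (DEFLATE) compression."""
    return len(zlib.compress(data, 9))


# Hypothetical stand-in for the 33742-byte source text: repetitive ASCII.
line = b"The quick brown fox jumps over the lazy dog.\n"
plaintext = (line * 1000)[:33742]

# Stand-in for the encrypted file. This is NOT real encryption: the sketch
# assumes ciphertext looks like random bytes and uses os.urandom instead.
ciphertext_like = os.urandom(33752)

print("plaintext:      ", len(plaintext), "->", compressed_size(plaintext))
print("ciphertext-like:", len(ciphertext_like), "->", compressed_size(ciphertext_like))
```

The text should shrink dramatically, while the random bytes should come out slightly larger than they went in, because the compressor finds nothing to exploit and still has to add its own framing. The growth from 33752 to 33989 bytes in the question is consistent with that: tar and gzip add header overhead on top of data that cannot shrink. This is also why compressing first and encrypting second, as the question suggests, keeps the file small.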