I have been using the pycrypto module to encrypt and decrypt with an RSA key pair. The problem appears when I encrypt larger files (a 10 kB text file). I read the file in 32-byte blocks and encrypt each block:
>>> f = open('10kb','rb')
>>> p = open('enc','wb')
>>> while True:
    data = f.read(32)
    if not data:
        break
    enc_data = public_key.encrypt(data,32)
    p.write(enc_data[0])
p.close()
f.close()
It gives the output:
128
128
... and so on, one 128 for every block it writes.
When I decrypt the encrypted file, I have to read it in 128-byte blocks to get the 32-byte blocks back:
>>> f = open('enc','rb')
>>> p = open('dec','wb')
>>> while True:
    data = f.read(128)
    if not data:
        break
    dec_data = private_key.decrypt(data)
    p.write(dec_data)
p.close()
f.close()
It is giving the output:
32
32
... many more 32-byte blocks, all decrypted correctly, then
128
128
128
128
Traceback (most recent call last):
  File "<pyshell#251>", line 5, in <module>
    enc_data = private_key.decrypt(data)
  File "/usr/lib/python3/dist-packages/Crypto/PublicKey/RSA.py", line 174, in decrypt
    return pubkey.pubkey.decrypt(self, ciphertext)
  File "/usr/lib/python3/dist-packages/Crypto/PublicKey/pubkey.py", line 93, in decrypt
    plaintext=self._decrypt(ciphertext)
  File "/usr/lib/python3/dist-packages/Crypto/PublicKey/RSA.py", line 237, in _decrypt
    cp = self.key._blind(ciphertext, r)
ValueError: Message too large
Up to the point where it outputs 32-byte blocks, the decryption is correct, but once the 128s start it goes wrong. Why does it say "Message too large"? Is there a better and faster way to decrypt large text files with the pycrypto module?
Partial answer coming along ...
RSA works on numbers. You only get bytes out of it when you serialize those long integers. Since those numbers don't have a fixed size, they are serialized with as many bytes as necessary, but no more.
An RSA encryption c = m^e mod n can produce a ciphertext that is so much smaller than n that not all of the bytes are filled, because the leading zeros of the number don't have to be serialized.
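To make the missing leading zeros concrete, here is a small sketch with toy textbook-RSA numbers (n = 3233, e = 17, far too small for real use). The helper `int_to_min_bytes` is a hypothetical name used for illustration; it mimics minimal-length serialization and is not pycrypto's own code.

```python
def int_to_min_bytes(x):
    """Serialize a non-negative integer big-endian with as few bytes as possible."""
    length = max(1, (x.bit_length() + 7) // 8)
    return x.to_bytes(length, "big")

# Toy parameters for illustration only: n = 61 * 53 fits in 2 bytes.
n, e = 3233, 17
modulus_size = len(int_to_min_bytes(n))

def ciphertext_size(m):
    """Byte length of the textbook-RSA ciphertext c = m^e mod n."""
    return len(int_to_min_bytes(pow(m, e, n)))

# Plaintexts whose ciphertext needs fewer bytes than the modulus.
short = [m for m in range(2, n) if ciphertext_size(m) < modulus_size]
print(modulus_size, ciphertext_size(65), len(short) > 0)
```

With a real 1024-bit key the same effect shows up as an occasional 127-byte ciphertext instead of a 128-byte one.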
Sometimes (depending on modulus and plaintext) you may write a 127-byte chunk instead of a 128-byte chunk during encryption, but you always read a 128-byte chunk during decryption. That means you're taking one byte away from the next chunk. Once the alignment breaks, you can run into various random behaviors, such as a chunk being larger than the modulus and therefore not a valid ciphertext.
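The misalignment can be reproduced without any cryptography. The sketch below uses made-up stand-in chunks (not real ciphertexts), the second of which is one byte short, and reads them back in fixed 128-byte blocks the way the question's decryption loop does.

```python
import io

# Stand-in "ciphertext" chunks; the second one is only 127 bytes long.
written = [b"\x01" * 128, b"\x02" * 127, b"\x03" * 128]
stream = io.BytesIO(b"".join(written))

read_back = []
while True:
    block = stream.read(128)
    if not block:
        break
    read_back.append(block)

for index, block in enumerate(read_back):
    # The comparison is True only while reader and writer are still aligned.
    print(index, len(block), block == written[index])
```

The second block borrows a byte from the third, and every block after it is shifted, so the decryptor receives values the encryptor never produced.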
There are two ways to solve that:
The first is to always write the length of each ciphertext chunk before the chunk itself.
Encryption:
while True:
    data = f.read(readsize)
    if not data:
        break
    i += 1  # chunk counter
    enc_data = public_key.encrypt(data, 32)[0]
    # Python 3: write the length as a single byte, then the chunk
    p.write(bytes([len(enc_data)]))
    p.write(enc_data)
Decryption:
while True:
    length = f.read(1)
    if not length:
        break
    data = f.read(ord(length))
    print(length, len(data))
    j += 1  # chunk counter
    dec_data = private_key.decrypt(data)
    p.write(dec_data[:readsize])
At the end you have to truncate the decrypted data to the original plaintext size, because you're working without PKCS#1 v1.5 padding or OAEP.
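As a self-contained sketch of the length-prefix framing, independent of pycrypto: `write_chunks` and `read_chunks` are hypothetical helper names, and the chunks are arbitrary stand-in bytes. A single length byte is assumed to be enough, which holds for chunks of up to 255 bytes (for example the 128 bytes at most that a 1024-bit modulus produces).

```python
import io

def write_chunks(out, chunks):
    """Write each chunk preceded by one byte holding its length."""
    for chunk in chunks:
        if len(chunk) > 255:
            raise ValueError("chunk does not fit a one-byte length prefix")
        out.write(bytes([len(chunk)]))
        out.write(chunk)

def read_chunks(inp):
    """Yield the chunks back, using the length prefix to find boundaries."""
    while True:
        prefix = inp.read(1)
        if not prefix:
            break
        length = prefix[0]
        chunk = inp.read(length)
        if len(chunk) != length:
            raise ValueError("stream ended in the middle of a chunk")
        yield chunk

# Stand-in chunks of different lengths, as variable-size ciphertexts would be.
original = [b"\x01" * 128, b"\x02" * 127, b"\x03" * 128]
buffer = io.BytesIO()
write_chunks(buffer, original)
buffer.seek(0)
recovered = list(read_chunks(buffer))
print(recovered == original)
```

The cost is one extra byte per chunk; in exchange the reader never has to guess where a chunk ends.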
The second is to prepend the zero bytes that are missing to each ciphertext chunk during encryption.
Encryption:
while True:
    data = f.read(readsize)
    if not data:
        break
    i += 1  # chunk counter
    enc_data = public_key.encrypt(data, 32)[0]
    # left-pad with zero bytes up to the fixed chunk size
    while len(enc_data) < writesize:
        enc_data = b"\x00" + enc_data
    p.write(enc_data)
Decryption:
while True:
    data = f.read(writesize)
    if not data:
        break
    j += 1  # chunk counter
    dec_data = private_key.decrypt(data)
    p.write(dec_data[:readsize])
Note that readsize = 127 and writesize = 128. Here are the full source codes for both variants.
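The padding loop can also be written as a single call. This sketch assumes Python 3 `bytes` objects and uses a hypothetical helper `pad_chunk`. It also checks that the added zero bytes do not change the integer the chunk represents, which is why a padded chunk should decrypt to the same value.

```python
def pad_chunk(chunk, size):
    """Left-pad a big-endian chunk with zero bytes to exactly `size` bytes."""
    if len(chunk) > size:
        raise ValueError("chunk is longer than the target size")
    return chunk.rjust(size, b"\x00")

writesize = 128
short_chunk = b"\x7f" * 127  # stand-in for a 127-byte ciphertext
padded = pad_chunk(short_chunk, writesize)

print(len(padded), padded[:2])
# Leading zero bytes leave the big-endian integer value unchanged.
print(int.from_bytes(padded, "big") == int.from_bytes(short_chunk, "big"))
```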
Now, this is a partial answer, because this still leads to corrupt files, which are also too short, but at least it fixes the OP's error.