I know that RSA can't encrypt more than 128 bytes at a time (the modulus size of my 1024-bit key), so I am encrypting and decrypting the file in chunks. However, if my file is larger than a few KB, the result changes every time I run the program. Sometimes the whole file is encrypted and decrypted correctly; sometimes only the first 100 lines or so come out right. At this point I'm wondering if it's a reliability issue with the Crypto.PublicKey.RSA module. Here's my code:
import os

from Crypto.PublicKey import RSA

def encrypt(file, public_key):
    read_size = 128
    with open(file, 'rb') as original_file:
        e_file = file + '.e'
        with open(e_file, 'wb') as encrypted_file:
            while True:
                file_part = original_file.read(read_size)
                if len(file_part) == 0:
                    break
                # Raw RSA: the chunk is encrypted without any padding
                encrypted_file.write(public_key.encrypt(file_part, None)[0])
    os.remove(file)

def decrypt(file, private_key):
    read_size = 128
    with open(file, 'rb') as encrypted_file:
        d_file = file[:-2]  # strip the '.e' suffix
        with open(d_file, 'wb') as decrypted_file:
            while True:
                file_part = encrypted_file.read(read_size)
                if len(file_part) == 0:
                    break
                decrypted_file.write(private_key.decrypt(file_part))
    os.remove(file)

private_key = RSA.generate(1024)
public_key = RSA.importKey(private_key.publickey().exportKey())
my_file = 'myfile.txt'
encrypt(my_file, public_key)
decrypt(my_file + '.e', private_key)
EDIT: Maarten's answer is valid. Here is a concrete example of how I solved my problem with his answer. I used this import:
from Crypto.Cipher import PKCS1_OAEP
Then, instead of encrypting directly with the public key, I used this:
cipher = PKCS1_OAEP.new(public_key)
encrypted_file.write(cipher.encrypt(file_part))
I then did something similar for decryption.
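One detail that the chunked OAEP approach depends on: padding takes room inside each RSA block, so the plaintext chunk has to be smaller than the key size, while every ciphertext block is exactly the key size. Below is a minimal sketch of that bookkeeping. The helper names (`oaep_max_chunk`, `encrypt_stream`, `decrypt_stream`) are made up for illustration, and `cipher` is assumed to be any object with `encrypt(bytes)` and `decrypt(bytes)` methods, such as the one returned by `PKCS1_OAEP.new(...)`. The default of 20 hash bytes assumes SHA-1, which is the library's default hash for OAEP.

```python
def oaep_max_chunk(key_bytes, hash_bytes=20):
    """Largest plaintext, in bytes, that fits in one RSA-OAEP block."""
    # OAEP overhead is two hash outputs plus two bytes.
    # hash_bytes=20 assumes SHA-1; use 32 for SHA-256.
    return key_bytes - 2 * hash_bytes - 2


def encrypt_stream(cipher, src, dst, chunk_size):
    """Read src in chunk_size pieces and write one encrypted block per piece."""
    while True:
        part = src.read(chunk_size)
        if not part:
            break
        dst.write(cipher.encrypt(part))


def decrypt_stream(cipher, src, dst, key_bytes):
    """Read src one RSA block (key_bytes bytes) at a time and decrypt it."""
    while True:
        block = src.read(key_bytes)
        if not block:
            break
        dst.write(cipher.decrypt(block))
```

With the 1024-bit key from the question this would presumably be used as `encrypt_stream(PKCS1_OAEP.new(public_key), original_file, encrypted_file, oaep_max_chunk(128))`, which reads 86 bytes at a time, and `decrypt_stream(PKCS1_OAEP.new(private_key), encrypted_file, decrypted_file, 128)` for the reverse direction.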
RSA requires padding to be secure, such as OAEP or the older, less secure PKCS#1 v1.5 padding. Padding adds some overhead to every block. In practice this is rarely a big issue, because a hybrid cryptosystem is normally used: RSA encrypts only a random symmetric key, and the symmetric cipher encrypts the actual data. Good choices would be RSA-OAEP and AES-GCM, and you may want to sign the plaintext first as well.
If you use raw RSA, you may run into trouble when the most significant bits of a block are set (the bits on the left; RSA uses big-endian / network byte order). In that case the block, read as a number, can be greater than or equal to N, the modulus used within the RSA calculation, and RSA cannot tell it apart from that number reduced modulo N.
For example, 4 % 5 = 4, but 9 % 5 is also 4. So when you perform the decryption, the answer will be 4 even if the input was 9. Depending on the first few bits of the plaintext block, the decrypted block can therefore come out completely different from the original.
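The wrap-around can be reproduced with a toy textbook-RSA key in plain Python. The primes 61 and 53 and the exponent 17 are illustrative values only, far too small for real use, and `raw_encrypt` / `raw_decrypt` are hypothetical helper names. The modular inverse via `pow(e, -1, phi)` needs Python 3.8 or later.

```python
# Toy textbook-RSA parameters (illustration only, not secure).
p, q = 61, 53
n = p * q                            # modulus N = 3233
e = 17                               # public exponent
d = pow(e, -1, (p - 1) * (q - 1))    # private exponent


def raw_encrypt(m):
    """Unpadded RSA: m^e mod n."""
    return pow(m, e, n)


def raw_decrypt(c):
    """Unpadded RSA: c^d mod n."""
    return pow(c, d, n)


small = 65          # below the modulus: survives the round trip
large = small + n   # at or above the modulus: reduced mod n first

print(raw_decrypt(raw_encrypt(small)))  # 65
print(raw_decrypt(raw_encrypt(large)))  # 65, not 3298
```

Both inputs decrypt to the same value because the calculation only ever sees the input modulo N, which is the same effect as a 128-byte chunk whose leading bits make it larger than a 1024-bit modulus.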