Tags: python, encryption, cryptography, rsa, pycrypto

Python Crypto, RSA Public/Private key, with large file


I now know that an RSA public/private key can only encrypt a very short input at once, but can anyone provide a way to encrypt any type of file (.txt, .pdf, .exe, etc.) with only the public/private key? I do not want an additional AES key.

Here is my code. I am not getting the original content back after encrypting and decrypting with the public/private key pair. I do not care how secure the encryption is; I just want simple encryption and decryption that works on any input, no matter how long or large it is.

from Crypto.PublicKey import RSA
from Crypto import Random


random_generator = Random.new().read
key = RSA.generate(1024, random_generator)
public_key = key.publickey()

f = open('C:\Users\Administrator\Desktop\jack.txt','r').read()

print 'original content: '+ f

enc_data = public_key.encrypt(f, 32)
print 'encrypted data: '
print enc_data

dec_data = key.decrypt(enc_data)
print 'decrypted data: '+ dec_data

Here is the output:

original content: Python Cryptography Toolkit

A collection of cryptographic modules implementing various algorithms and protocols.

Subpackages:

Crypto.Cipher
Secret-key (AES, DES, ARC4) and public-key encryption (RSA PKCS#1) algorithms
Crypto.Hash
Hashing algorithms (MD5, SHA, HMAC)
Crypto.Protocol
Cryptographic protocols (Chaffing, all-or-nothing transform, key derivation functions). This package does not contain any network protocols.
Crypto.PublicKey
Public-key encryption and signature algorithms (RSA, DSA)
Crypto.Signature
Public-key signature algorithms (RSA PKCS#1)
Crypto.Util
Various useful modules and functions (long-to-string conversion, random number generation, number theoretic functions)
encrypted data: 
('\x08\xe3\x9d\x03\x1e\xe9(\xe2\xc7\xc6e\x0b5\x02\xc0\xd8G\x1f\xf5\xb8\x9cMC\x93Z\x982\xa5\x97\xec\xab4\x18\xc2\xc8\xd9\xd3\x99aX\xd96b\x19\x96\xdc\x1d|F\xe0\xa9\xa9\xea\x03\x10>0g\x83\xdb\xeb\xdb\x13\x91\xc6\xd8\xf6\x95\xedE@A\x0bc\xae\xbe\xbe\xf0\xde\xcc\xcexk\x10\xb3\x86\xd3\xdd\xd0\xca@T2\x9a\x8a6ut\xb1\xaf\x07\x1f\xa2M\r\xf0D\xa2`h\xc3\x89\x18\x0e\xd4\xca\xee\xf5\xfc\x01\xed\x95}X\x1f\x13 1',)
decrypted data: ���J�rPX �����ju�a,�xm�'�]��ٟ�?y;�)��tĹ�,�D4^�ba�8����9q
+�i��l �q]Kd�Y���u��S�B���Ϲ�^�A3
.7��j��m�
�6�dl� qU

Solution

  • RSA is quite slow, so it's not really suitable for encrypting or decrypting large blocks of data. It's normally used to encrypt and decrypt the keys of a faster symmetric cipher, e.g. AES.

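    At a glance, your encrypted data looks too small to correspond to your input data: a 1024-bit key produces a ciphertext block of at most 128 bytes, and your file is clearly longer than that. I'm not that familiar with Crypto (I only installed it myself a few days ago), so I can't explain exactly what it's done to your data.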

    But this code works for me:

    #!/usr/bin/env python
    
    from Crypto.PublicKey import RSA
    from Crypto import Random
    
    src_data = 'To be, or not to be - that is the question.'
    print `src_data`
    
    random_generator = Random.new().read
    key = RSA.generate(1024, random_generator)
    print 'Key generated'
    
    pub_key = key.publickey()
    print 'Public key', pub_key
    
    enc_data = pub_key.encrypt(src_data, 32)[0]
    print `enc_data`
    
    dec_data = key.decrypt(enc_data)
    print `dec_data`
    

    Typical output

    'To be, or not to be - that is the question.'
    Key generated
    Public key <_RSAobj @0xb7114dcc n(1024),e>
    ',\x17\xb1\x8a\x98\xb0-z\x8c\xb8r\x17\xa2\xfe[\x10I\x97\x93\x9d[\x93\x19&\\\x16V\xc2\xa3\x99\x80\xa5\x08\xafT\xb5iA|\x89\xeeJ\x90%\xceXv\x9f\x9f\xcb\\P"i\x00D\xd4\x16\xee\xa9\xe49\x18[\xa5\x0f\xd3\xfb\x91\xd5\x98\x1bP\xbf\xa4\xa5Dz\x8b7\x13\x9dqk+\xf7A\xd3\x12\x1c\x06\xcep\xf2\xba\xc6\xee\xf8\xa2\xb4\x04v\xfb\xb7>\xb3U\x17\xban\xf7\xc0oM+Tq\xce\xe3D\x83\xb9\xa4\x90\xe6c,\x18'
    'To be, or not to be - that is the question.'
    

    FWIW, here's a slightly modified version of the above which runs on both Python 2 and Python 3, although there will be minor differences in the output of the two versions.

    In Python 3 we cannot pass strings to the encryption or decryption functions; we must pass bytes. Also, Python 3 doesn't support the backtick syntax that gets the repr of an object in Python 2.

    This code calls the string .encode() and bytes .decode() methods to perform the conversions. We could specify an encoding codec, e.g.

    src_data.encode('utf-8')
    

    but that's not necessary, since UTF-8 is the default codec in Python 3 (and Python 2's ASCII default is enough for this pure-ASCII string).

    from __future__ import print_function
    
    from Crypto.PublicKey import RSA
    from Crypto import Random
    
    src_data = 'To be, or not to be - that is the question.'
    print(repr(src_data))
    
    # Generate a 1024-bit key pair and extract the public half
    random_generator = Random.new().read
    key = RSA.generate(1024, random_generator)
    pub_key = key.publickey()
    print('Key generated')
    
    # Show both keys in PEM format
    print(key.exportKey().decode())
    print(pub_key.exportKey().decode())
    
    # encrypt() takes bytes and returns a tuple; the ciphertext is item 0
    enc_data = pub_key.encrypt(src_data.encode(), 32)[0]
    print('Encoded\n', repr(enc_data))
    
    # decrypt() returns bytes, so decode back to a string
    dec_data = key.decrypt(enc_data).decode()
    print('Decoded\n', repr(dec_data))
    

    Typical Python 3 output

    'To be, or not to be - that is the question.'
    Key generated
    -----BEGIN RSA PRIVATE KEY-----
    MIICXAIBAAKBgQDL/TzI4yHmlcC8qP3xWNieujmXR7CnEaZJrDH1Hyr/tGNa0aEE
    jDIz+RlMntBbhOuiQMkMtCSB5X28h7HetiD4XkWTXmlIiKZQLZ074cO5mxF+HhF7
    WIG30VONpX+Q4t/beqtaqbzyeIWvDdcCjUwOSQLrUKU5PX9LFzX+FnN1UwIDAQAB
    AoGASRVZib+Wjb5pZy5EjQt/0J53s7ODnte78/k1jNS12xcN4aPpRG/WLLi6T7E2
    hROCOIdtgJep3MAT5E/ZciledflaDwwmLo6+NsrhMppsNhpIHsvxWxmwxnH+bC2H
    lpyeUmxku4xzqwYW4kuF3iaR45K2eUpXQyWTE9+pgvepgoECQQDT6Waiavstvs4W
    btW2P4J+7O//NmTcvnyLTnhFTXklxTxnSun54HYOce8+TttsXWESTbzf91saN5SW
    0vPyKK25AkEA9m3gbwFppiA65464jnDCP1AnrR15n3tbsLpaadYdRm07b+4BB0R2
    M820cgber14JiGndOfv1uPl1Ooy0IH4hawJBAJKRC/uqIrAxGDlLz2SN6KQBHL1X
    0csbtOhlDaphOzl0gaKvncTGCuFSzDY8NGdu7oTKX6hIXSp05sCqhy8mE4ECQE49
    xKx5/llIkmtC3UYcdxAzGuXUHfGM8SfDg0FnQhRCSkTXhGwSSJVaEpjBpaJ4cP5m
    3l6yqOn6CkZ0thw679ECQCWNC5hVEtsAb0TcjGdTpw+xTFusiZciNyHTQ64Zq2cc
    ehQrxTRDIEBA4wIgUwrTwdVXk10OtpcVZvLIVjqdC84=
    -----END RSA PRIVATE KEY-----
    -----BEGIN PUBLIC KEY-----
    MIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKBgQDL/TzI4yHmlcC8qP3xWNieujmX
    R7CnEaZJrDH1Hyr/tGNa0aEEjDIz+RlMntBbhOuiQMkMtCSB5X28h7HetiD4XkWT
    XmlIiKZQLZ074cO5mxF+HhF7WIG30VONpX+Q4t/beqtaqbzyeIWvDdcCjUwOSQLr
    UKU5PX9LFzX+FnN1UwIDAQAB
    -----END PUBLIC KEY-----
    Encoded
     b'\x843\x9aJ\xe6\x91p\xd2\x9c\xd0r{37\xa2G\x13Q\xc7~\xbd5\xce\x9f\xd4\x16\xda\x11\x02.\xb7\xf1\xf3Q\x8c|\xb0R2B\x1b)e\xeaD\x8e\x11\x1b\xb0J:\xbal\xac\x86\xdcb}_\x16IX\xccd\x0c\xb5E?Im<\x04ORT\xc9\xc6K|;\xf3\xbcK\xfd\x89\x96ZF(\x0b\x82v\x19`\xc3\xa1N\x934*\x9c\xfcT\xf4i\x02g\x1fl\xec\xc1\x19z\x9f7\xa6}\xe2\xe3}\xaa|\x1e\x13z\xd9$\xea'
    Decoded
     'To be, or not to be - that is the question.'
    

    We don't really need to use UTF-8 encoding here. Since src_data is a pure 7-bit ASCII string, and we've embedded it into the script as a literal, we could have supplied it as a literal bytes string instead:

    src_data = b'To be, or not to be - that is the question.'
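
As for the original goal of pushing an arbitrarily long input through RSA alone, one way to do it is to cut the data into blocks that are each smaller than the modulus and process them one at a time. The sketch below is only an illustration of that idea and does not use the Crypto package at all: it is unpadded "textbook" RSA on plain Python integers (Python 3.8+ for the three-argument pow with a negative exponent), with a small fixed key built from two known Mersenne primes. The names rsa_raw, encrypt_bytes and decrypt_bytes, and the one-byte 0x01 marker used to preserve leading zero bytes, are my own assumptions, not part of any library. It is not secure and is slow on large inputs.

```python
# Toy "textbook" RSA using only built-in integers (Python 3.8+).
# Illustration only: fixed small key, no padding, NOT secure.

P = 2**107 - 1                        # Mersenne prime M107
Q = 2**127 - 1                        # Mersenne prime M127
N = P * Q                             # public modulus (234 bits)
E = 65537                             # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))     # private exponent (modular inverse)

# Every ciphertext block is stored in a fixed number of bytes.
CIPHER_BLOCK = (N.bit_length() + 7) // 8        # 30 bytes
# Data bytes per block: one byte is reserved for the 0x01 marker, and the
# marked block must still be numerically smaller than N.
PLAIN_BLOCK = (N.bit_length() - 1) // 8 - 1     # 28 bytes


def rsa_raw(value, exponent):
    """Raw RSA operation: value ** exponent mod N."""
    return pow(value, exponent, N)


def encrypt_bytes(data):
    """Encrypt bytes of any length, one small block at a time."""
    out = bytearray()
    for i in range(0, len(data), PLAIN_BLOCK):
        # The 0x01 prefix keeps leading zero bytes and the block length.
        chunk = b"\x01" + data[i:i + PLAIN_BLOCK]
        m = int.from_bytes(chunk, "big")
        out += rsa_raw(m, E).to_bytes(CIPHER_BLOCK, "big")
    return bytes(out)


def decrypt_bytes(blob):
    """Reverse encrypt_bytes: decrypt each block and drop its marker."""
    out = bytearray()
    for i in range(0, len(blob), CIPHER_BLOCK):
        c = int.from_bytes(blob[i:i + CIPHER_BLOCK], "big")
        m = rsa_raw(c, D)
        chunk = m.to_bytes((m.bit_length() + 7) // 8, "big")
        out += chunk[1:]              # strip the 0x01 marker byte
    return bytes(out)


if __name__ == "__main__":
    # A message larger than the modulus wraps around and cannot be recovered.
    too_big = int.from_bytes(b"x" * 200, "big")
    assert rsa_raw(rsa_raw(too_big, E), D) == too_big % N
    assert rsa_raw(rsa_raw(too_big, E), D) != too_big

    # Chunked processing round-trips arbitrary binary data.
    data = bytes(range(256)) * 10
    assert decrypt_bytes(encrypt_bytes(data)) == data
    print("round trip OK")
```

To apply this to a file, read it in binary mode ('rb') so that .exe and other non-text files come through unchanged, pass the bytes to encrypt_bytes, and write the result in 'wb' mode. The first check in the demo shows the wrap-around that happens when a single oversized message is fed to raw RSA, which is consistent with the garbage seen in the question.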