Tags: python, encryption, hash, cryptography, pycrypto

How to encrypt multiple files in Python 2


I've been creating a data-protection program which encrypts all files on a computer using SHA-256. So far, the program can encrypt one specified file at a time (the name is hard-coded into the program) and append a .enc extension. The only problem is that the program creates a new file after the encryption instead of replacing the original. So if I encrypt mypass.txt, I end up with both mypass.txt and mypass.txt.enc, but I need mypass.txt to be converted into mypass.enc. Additionally, if anyone has any idea how to encrypt all files, as opposed to just the one that is hard-coded, I would be extremely thankful. Thanks to anyone who has any input; please let me know if you need any additional information.

import os, struct
from Crypto.Cipher import AES

def encrypt_file(key, in_filename, out_filename=None, chunksize=64*1024):
    # Default output name: the input name with '.enc' appended.
    if not out_filename:
        out_filename = in_filename + '.enc'

    # Take the IV from the OS's cryptographic random source; the random
    # module is not suitable for this.
    iv = os.urandom(16)
    encryptor = AES.new(key, AES.MODE_CBC, iv)
    filesize = os.path.getsize(in_filename)

    with open(in_filename, 'rb') as infile:
        with open(out_filename, 'wb') as outfile:
            # Header: original size (8 bytes, little-endian), then the IV.
            outfile.write(struct.pack('<Q', filesize))
            outfile.write(iv)

            while True:
                chunk = infile.read(chunksize)
                if len(chunk) == 0:
                    break
                elif len(chunk) % 16 != 0:
                    # Pad the last chunk to a multiple of the 16-byte AES block size.
                    chunk += b' ' * (16 - len(chunk) % 16)

                outfile.write(encryptor.encrypt(chunk))

Solution

  • I'm assuming that you want to remove the contents of the original file as thoroughly as possible. After creating the encrypted file, you could overwrite the original file with zero-valued bytes and then delete it.

    Note: This applies to an HDD. SSDs can and will use a different memory block when overwriting a file, for the purpose of wear levelling, so overwriting with zero bytes is not useful on an SSD. For SSDs you should make sure that TRIM is enabled. (How that is done depends on the OS and filesystem used.) Only the SSD's controller determines when it will reuse a block of memory, obliterating the old contents, so on an SSD you cannot really be sure that the file contents are gone.
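
On an HDD, the overwrite-then-delete step could be sketched roughly as follows. The helper names overwrite_with_zeros and overwrite_and_delete are hypothetical (they are not part of the original program), and the sketch assumes the encrypted copy has already been written successfully before the original is touched.

```python
import os


def overwrite_with_zeros(path, chunksize=64 * 1024):
    """Replace the contents of an existing file with zero bytes, in place.

    Hypothetical helper for illustration only.
    """
    remaining = os.path.getsize(path)
    zeros = b'\0' * chunksize
    # 'r+b' opens the existing file for writing without truncating it,
    # so the zeros are written over the old contents.
    with open(path, 'r+b') as f:
        while remaining > 0:
            n = min(chunksize, remaining)
            f.write(zeros[:n])
            remaining -= n
        # Push the data out of Python's and the OS's buffers.
        f.flush()
        os.fsync(f.fileno())


def overwrite_and_delete(path):
    """Zero out a file and then remove it (assumes an HDD, not an SSD)."""
    overwrite_with_zeros(path)
    os.remove(path)
```

Calling encrypt_file(key, 'mypass.txt', 'mypass.enc') followed by overwrite_and_delete('mypass.txt') would then leave only mypass.enc. Without an explicit out_filename, the code in the question names the output mypass.txt.enc.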

    For the reasons mentioned above, I think that it is a better idea to use an encrypted filesystem for confidential data, rather than encrypting individual files. That way everything that is written to the physical device is encrypted.

    As for encrypting multiple files, you have several options.

    1. Give the names of the files to be encrypted on the command line. These can be retrieved in your script as sys.argv[1:].
    2. Use os.walk to recursively retrieve the paths of all files under the current working directory and encrypt them.
    3. A combination of the two. If a path in sys.argv[1:] is a file (test with os.path.isfile), encrypt it. If it is a directory (test with os.path.isdir), use os.walk to find all files in that directory and encrypt them.
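
The third option might look something like the sketch below. The function name collect_files is hypothetical; it only gathers the paths, and the actual encryption call is left to the caller.

```python
import os


def collect_files(paths):
    """Return every file named in `paths`, descending into directories.

    Hypothetical helper: plain files are kept as given, directories are
    expanded recursively with os.walk, anything else is ignored.
    """
    found = []
    for path in paths:
        if os.path.isfile(path):
            found.append(path)
        elif os.path.isdir(path):
            for dirpath, dirnames, filenames in os.walk(path):
                for name in filenames:
                    found.append(os.path.join(dirpath, name))
    return found
```

In the script, this could be driven by something like "for filename in collect_files(sys.argv[1:]): encrypt_file(key, filename)". It is probably worth skipping names that already end in .enc, so that a second run does not encrypt the output of the first.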