Tags: python, python-2.7, encryption, ipython, pycrypto

PyCrypto: Is It Possible To Check If a File Is Already AES Encrypted?


    import os
    import random
    import struct

    from Crypto.Cipher import AES

    def encrypt_file(key, in_filename, out_filename=None, chunksize=64*1024):
        """ Encrypts a file using AES (CBC mode) with the
            given key.

            key:
                The encryption key - a string that must be
                either 16, 24 or 32 bytes long. Longer keys
                are more secure.

            in_filename:
                Name of the input file

            out_filename:
                If None, '<in_filename>.enc' will be used.

            chunksize:
                Sets the size of the chunk which the function
                uses to read and encrypt the file. Larger chunk
                sizes can be faster for some files and machines.
                chunksize must be divisible by 16.
        """
        if not out_filename:
            out_filename = in_filename + '.enc'

        iv = ''.join(chr(random.randint(0, 0xFF)) for i in range(16))
        encryptor = AES.new(key, AES.MODE_CBC, iv)
        filesize = os.path.getsize(in_filename)

        with open(in_filename, 'rb') as infile:
            with open(out_filename, 'wb') as outfile:
                outfile.write(struct.pack('<Q', filesize))
                outfile.write(iv)

                while True:
                    chunk = infile.read(chunksize)
                    if len(chunk) == 0:
                        break
                    elif len(chunk) % 16 != 0:
                        chunk += ' ' * (16 - len(chunk) % 16)

                    outfile.write(encryptor.encrypt(chunk))

This is how I encrypt a file, but if you run it twice or more on the same file, it will keep encrypting it, no questions asked. I want to add a check that refuses to encrypt a file that is already AES-encrypted. Is this possible?


Solution

  • The most common solution is to write a "magic" string at the beginning of the encrypted file, followed by the encrypted content. If that string is found when reading a file, further encryption is refused. For decryption, it is read to verify that this is a file we encrypted, but otherwise it is ignored.

    Suppose you use "MyCrYpT" as the magic string (it doesn't matter what you use, as long as it is reasonably unique).

    magic = "MyCrYpT"
    # writing the encrypted file
    with open(out_filename, 'wb') as outfile:
        outfile.write(magic)  # write the identifier.
        outfile.write(struct.pack('<Q', filesize))  # file size
        outfile.write(iv)
        # et cetera
    

    Now, when reading the file, we read all the data, and then check if it is ours. Then we discard the magic and process the rest.

    with open(in_filename, 'rb') as infile:
        data = infile.read()
        if data[:len(magic)] != magic:
            raise ValueError('Not an encrypted file')
        filedata = data[len(magic):]
        # Process the file data
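The check itself does not need the whole file in memory. Below is a minimal sketch that reads only the first few bytes, assuming the header layout from the answer above (magic string, then the 8-byte size, then the 16-byte IV). The names `MAGIC`, `is_encrypted` and `read_header` are made up for illustration, and the marker is written as a bytes literal so the same code also runs on Python 3. Since AES ciphertext looks like random bytes, a marker like this is probably the only practical way to recognise your own files.

```python
import struct

# Hypothetical marker; any reasonably unique byte string works.
MAGIC = b"MyCrYpT"
SIZE_FMT = "<Q"   # same format the question uses for the file size
IV_SIZE = 16      # AES block size in bytes


def is_encrypted(filename, magic=MAGIC):
    """Return True if the file starts with the magic marker."""
    with open(filename, "rb") as infile:
        # Read only as many bytes as the marker needs, not the whole file.
        return infile.read(len(magic)) == magic


def read_header(infile, magic=MAGIC):
    """Consume the header of an open binary file and return (filesize, iv)."""
    if infile.read(len(magic)) != magic:
        raise ValueError("Not an encrypted file")
    size_bytes = infile.read(struct.calcsize(SIZE_FMT))
    iv = infile.read(IV_SIZE)
    if len(size_bytes) != struct.calcsize(SIZE_FMT) or len(iv) != IV_SIZE:
        raise ValueError("Truncated header")
    (filesize,) = struct.unpack(SIZE_FMT, size_bytes)
    return filesize, iv
```

In `encrypt_file`, a guard such as `if is_encrypted(in_filename): raise ValueError('Already encrypted')` placed before the output file is opened would refuse a second pass. A plaintext file that happens to start with the same bytes would be misdetected, so a longer or more unusual marker lowers that risk.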