
Encrypting and Decrypting a string from Input() not working using Python and Pycryptodome


I'm trying to create an application that will encrypt and decrypt data using a key derived from a password that the user enters. I'm using PyCryptodome and I get this error: ValueError: Data must be padded to 16 byte boundary in CBC mode

Below is the code:

from Crypto.Protocol.KDF import PBKDF2
from Crypto.Cipher import AES
from Crypto.Util.Padding import pad, unpad


class SingleEncryption:

    def __init__(self, password):
        self.password = password
        # self.salt = random data
        self.key = PBKDF2(self.password, self.salt, dkLen=32)

    def saveKey(self):

        if input(f"Would you like to save this key ({self.key}) to a file? Y/N ") == "Y":
            file_out = open(input("Where should I save it? "), "wb")
            file_out.write(self.key)
            file_out.close()
        else:
            print(f"Key not saved, the key is {self.key}")

    def encrypt(self, data):
        cipher = AES.new(self.key, AES.MODE_CBC)
        ciphered_data = cipher.encrypt(pad(data, AES.block_size))
        return ciphered_data

    def decrypt(self, cipher_data):
        cipher2 = AES.new(self.key, AES.MODE_CBC,)
        original_data = unpad(cipher2.decrypt(cipher_data), AES.block_size)
        return original_data


if __name__ == "__main__":
    encrypt = SingleEncryption(input("Encryption password? "))
    while True:
        q = input("Encrypt, Decrypt, or Quit? ")
        if q == "Encrypt":
            print(encrypt.encrypt(input("What to encrypt? ").encode()))
        elif q == "Decrypt":
            print(encrypt.decrypt(input("Encrypted data? ").encode()))
        elif q == "Quit":
            break
        else:
            print("Incorrect input, it is case sensitive.")


Solution

  • The error message Data must be padded to 16 byte boundary in CBC mode is raised during decryption. It is caused by a ciphertext that was corrupted by an I/O problem, more precisely by an incorrect encoding, and not, as might be assumed at first glance, by missing padding/unpadding (the current code already pads the plaintext and unpads the decrypted data).

    The ciphertext is output by the application as a byte string, marked with a b'' (or b""), which generally contains escape sequences of the form \xNM in addition to ASCII characters.

    If this byte string is entered during decryption, input() returns it as literal text: the \x is not interpreted as the start of an escape sequence but as two ordinary characters (the same applies to the b'' wrapper if it is typed in as well). This corrupts the ciphertext, changing not only its content but also its length, so that it is no longer a multiple of 16 bytes, which is the actual cause of the error message.
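The effect can be reproduced without any cryptography. In this minimal sketch, the 16-byte value is an arbitrary stand-in for a ciphertext (not real AES output), and str(...) mimics the text that print() displays and that input() would hand back if it were pasted in:

```python
# Stand-in for a 16-byte ciphertext block (arbitrary bytes, not real AES output).
ciphertext = bytes.fromhex("00112233445566778899aabbccddeeff")

# print() shows the repr of the bytes object: b'...' with \xNM escape sequences.
shown = str(ciphertext)

# input() returns exactly those characters as text. Calling .encode() on them
# turns every backslash, 'x', digit, and quote into a byte of its own instead
# of decoding the escape sequences.
pasted = shown.encode()

print(shown)
print(len(ciphertext))        # 16
print(len(pasted))            # much longer than 16
print(len(pasted) % 16 == 0)  # False: no longer aligned to the AES block size
print(pasted == ciphertext)   # False: the content changed as well
```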

    A simple solution to the problem is a binary-to-text encoding such as hex or Base64, e.g.:

    ...
    if q == "Encrypt":
        print(encrypt.encrypt(input("What to encrypt? ").encode()).hex())         # hex encode ciphertext
    elif q == "Decrypt":
        print(encrypt.decrypt(bytes.fromhex(input("Encrypted data? "))).decode()) # hex decode ciphertext before decryption
    ...
    

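    With this change, that error message is no longer triggered; instead, a Padding is incorrect error is generally raised (at least for plaintexts that fit into a single block, since a wrong IV in CBC mode garbles only the first decrypted block).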

    This second problem is caused by the fact that encryption and decryption use different IVs. Keep in mind that the CBC mode used requires an IV (see AES.MODE_CBC). Since no IV is specified, PyCryptodome generates a fresh random IV in every AES.new() call, once for encryption and once for decryption, so the two IVs differ, which leads to the error.

    The solution is to store the IV somewhere. Commonly, the IV and the ciphertext are concatenated during encryption (and separated during decryption). The exposure of the IV is not a security problem, as the IV is not a secret:

    ...
    def encrypt(self, data):
        cipher = AES.new(self.key, AES.MODE_CBC)
        ciphered_data = cipher.encrypt(pad(data, AES.block_size))
        return cipher.iv + ciphered_data # concatenate IV and ciphertext
            
    def decrypt(self, cipher_data):
        iv = cipher_data[:16]         # separate IV...
        ciphertext = cipher_data[16:] # ...and ciphertext
        cipher = AES.new(self.key, AES.MODE_CBC, iv) # specify the IV from encryption
        original_data = unpad(cipher.decrypt(ciphertext), AES.block_size)
        return original_data
    

    A third problem may be caused by the handling of the random salt. When a random salt is used (which is required for security reasons), each SingleEncryption instance derives a different key (since the key derivation takes place in __init__()), so decryption only succeeds if it is performed with the same SingleEncryption instance that did the encryption. If that meets the requirements, this is fine (although encrypting multiple messages with the same salt is a vulnerability).

    However, if encryption and decryption should also be possible with different SingleEncryption instances, it is necessary to move the generation of the random salt and the key derivation from __init__() to encrypt(), where the salt must additionally be concatenated with the IV and ciphertext (the salt, like the IV, is not secret).
    In decrypt(), the salt, IV and ciphertext are separated, then the key is derived and finally the ciphertext is decrypted.