Can't encrypt strings with special characters with PyCrypto AES


Description

I want to store people's names in a MySQL database. Because the data is sensitive, I want to encrypt it with AES. I am using the PyCrypto AES module. The code I am using is:

import base64
import hashlib

from Crypto import Random
from Crypto.Cipher import AES

class AESCipher(object):

    def __init__(self, key):
        self.bs = 64
        self.key = hashlib.sha256(key.encode()).digest()

    def encrypt(self, raw):
        raw = self._pad(raw)
        iv = Random.new().read(AES.block_size)
        cipher = AES.new(self.key, AES.MODE_CBC, iv)
        return base64.b64encode(iv + cipher.encrypt(raw))

    def decrypt(self, enc):
        enc = base64.b64decode(enc)
        iv = enc[:AES.block_size]
        cipher = AES.new(self.key, AES.MODE_CBC, iv)
        return self._unpad(cipher.decrypt(enc[AES.block_size:])).decode('utf-8')

    def _pad(self, s):
        return s + (self.bs - len(s) % self.bs) * chr(self.bs - len(s) % self.bs)

    @staticmethod
    def _unpad(s):
        return s[:-ord(s[len(s)-1:])]

operator = AESCipher(data_encryption_key)

The key used for encryption is a long random string.

Problem

Let's say the string (an example name) I want to encrypt is "Strah". I get the following ciphertext:

b'kA/Q5snPUHltzh3Kl8QMH/uTpfcjdXtvrx0JUrGv2tk+P86ERfkv0eTBV5j6MThkKplLLcn4f1Ei4Q1gT/FcVx+PhEnqczKhuvLzrLHYlQ4='

But if the name includes special characters such as č, š or ž, and I try to encrypt a name like "Štrah", I get the following error:

ValueError: Input strings must be a multiple of 16 in length

So the question is: what should I do to encrypt strings with special characters?


Solution

  • The problem here is that the cipher internally operates on bytes, but you're giving it a string. You're padding the string to a multiple of 16 characters, but a non-ASCII character such as Š takes more than one byte in UTF-8, so once that string is encoded to bytes its length is no longer a multiple of 16.

    >>> text = 'Štrah'
    >>> padded = AESCipher('')._pad(text)
    >>> len(padded)
    64
    >>> len(padded.encode('utf8'))
    65
    

    The solution is to encode the string to bytes yourself before padding, instead of letting the cipher do it for you. You have to make two small changes:

    def encrypt(self, raw):
        raw = raw.encode('utf8')  # encode to bytes here
        raw = self._pad(raw)
        iv = Random.new().read(AES.block_size)
        cipher = AES.new(self.key, AES.MODE_CBC, iv)
        return base64.b64encode(iv + cipher.encrypt(raw))
    
    def _pad(self, s):
        # pad with bytes instead of str; each padding byte holds the pad length
        n = self.bs - len(s) % self.bs
        return s + bytes([n]) * n
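
The effect of encoding before padding can be checked without the cipher at all. The following is only a minimal sketch: the standalone helpers pad and unpad are hypothetical stand-ins for the class's _pad and _unpad methods, and the default block size of 16 (the AES block size) is an assumption; the class above uses self.bs = 64, which works the same way.

```python
def pad(data: bytes, block_size: int = 16) -> bytes:
    """Append n bytes, each with value n, so len(result) is a multiple of block_size."""
    n = block_size - len(data) % block_size
    return data + bytes([n]) * n


def unpad(data: bytes) -> bytes:
    """Strip the padding added by pad(); the last byte gives its length."""
    return data[:-data[-1]]


name = "Štrah"
encoded = name.encode("utf-8")  # 'Š' takes two bytes in UTF-8
padded = pad(encoded)           # padding is computed on the byte length

print(len(name), len(encoded), len(padded))   # 5 6 16
print(unpad(padded).decode("utf-8") == name)  # True
```

Because the padding length is computed from the encoded bytes, the input handed to the cipher is always a whole number of blocks, whatever characters the name contains.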