HMAC fails to detect a slight change in the key


Suppose the key used to encrypt a file is this:

SDEREvalYDHK3xcuzChG7CU4hLBaoaVSvaJg_Fqo7UY=

Encryption works fine. However, a slight change to the key is not detected when I use hmac.compare_digest():

SDEREvalYDHK3xcuzChG7CU4hLBaoaVSvaJg_Fqo7UZ=

Notice that the second-to-last character has changed from Y to Z. Decryption still succeeds, but I expected it to fail.

What could I be doing wrong? I'm using the PyCryptodome module, if that helps.

import os, hmac, hashlib
from base64 import urlsafe_b64encode, urlsafe_b64decode
from Crypto.Cipher import AES
from Crypto.Util.Padding import pad, unpad

BS = AES.block_size
class CRYPTOPRACTICE:
    def key(self):
        with open("k.txt", "rb") as k:
            return urlsafe_b64decode(k.read())

    def e(self, file):
        with open(file, "rb") as data:
            IV = os.urandom(BS)
            e = AES.new(self.key(), AES.MODE_CBC, IV).encrypt(pad(data.read(), BS))
            sig = hmac.new(self.key(), e, hashlib.sha256).digest()
            with open(file + ".encrypted", "wb") as enc:
                enc.write(IV + e + sig)
        os.remove(file)

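    def d(self, file):
        # Read the whole file once and close it before any removal
        with open(file, "rb") as data:
            blob = data.read()
        # Layout written by e(): IV (16 bytes) + ciphertext + HMAC-SHA256 tag (32 bytes)
        IV, e, sig = blob[:BS], blob[BS:-32], blob[-32:]
        auth = hmac.new(self.key(), e, hashlib.sha256).digest()

        # Compare the tags in constant time before decrypting
        if hmac.compare_digest(sig, auth):
            d = unpad(AES.new(self.key(), AES.MODE_CBC, IV).decrypt(e), BS)
            # file[:-10] strips the ".encrypted" suffix
            with open(file[:-10], "wb") as dec:
                dec.write(d)
            os.remove(file)
        else: print(f"Fail: {file}")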

a = CRYPTOPRACTICE()
a.e("test.txt")
a.d("test.txt.encrypted")

Solution

  • You're assuming that the slight change in the text also changes the decoded binary key. However, base 64 encodes 6 bits per character, so unless the input length is a multiple of 3 bytes, the final character before the padding does not carry a full 6 bits of data.

    In your case there is one padding character, so the last three characters encode 2 * 8 = 16 bits, although they have room for 3 * 6 = 18 bits. The lowest two bits of the final character's index into the base 64 alphabet are therefore normally set to zero, and most decoders simply ignore them. Y is index 24 (011000) and Z is index 25 (011001): they differ only in those ignored bits, so both characters encode the same 4 data bits and both strings decode to the same key. The HMAC is then computed with identical key bytes, which is why verification and decryption succeed. Only a change that touches the upper 4 bits of that character, or any other character, would alter the key.

    With two padding characters, 4 bits of the final character are unused in the same way. With no padding characters, every bit of every character is significant, so any changed character changes the binary.
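
    The effect can be checked directly. The following is a minimal sketch, assuming Python's standard base64 module with its default (lenient) decoding; the helper name is_canonical is hypothetical and not part of any library. It decodes both key strings from the question, compares the resulting bytes, and shows one possible way to reject a non-canonical encoding by re-encoding and comparing.

```python
from base64 import urlsafe_b64decode, urlsafe_b64encode

# The two key strings from the question; only the second-to-last character differs.
KEY_Y = "SDEREvalYDHK3xcuzChG7CU4hLBaoaVSvaJg_Fqo7UY="
KEY_Z = "SDEREvalYDHK3xcuzChG7CU4hLBaoaVSvaJg_Fqo7UZ="


def is_canonical(encoded: str) -> bool:
    """Hypothetical helper: True if re-encoding the decoded bytes
    reproduces the original text exactly (unused trailing bits are zero)."""
    decoded = urlsafe_b64decode(encoded)
    return urlsafe_b64encode(decoded).decode("ascii") == encoded


raw_y = urlsafe_b64decode(KEY_Y)
raw_z = urlsafe_b64decode(KEY_Z)

print("key length in bytes:", len(raw_y))
print("same decoded key:", raw_y == raw_z)
print("canonical:", is_canonical(KEY_Y), is_canonical(KEY_Z))
```

    If the goal is for an altered key string to be rejected, a check like is_canonical could be run on the text before decoding. Whether that is worthwhile depends on the application, since the HMAC is behaving correctly here: both strings yield the same key bytes.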