
PHP7.4: OpenSSL AES-CFB encryption different to Python


I'm trying to use PHP 7.4 to replicate a piece of Python code that uses PyCryptodome to do an AES-128-CFB encryption. For this I'm using PHP's built-in openssl_encrypt function. I tried several configuration parameters and CFB modes, but I get different results every time. I found out that PyCryptodome's CFB implementation seems to use an 8-bit segment size, which should correspond to the aes-128-cfb8 mode in PHP's OpenSSL implementation.

The IV is intentionally fixed to 0, so please ignore the fact that this is insecure.

Here is the code I want to replicate, followed by the PHP code that tries to reproduce its result with different approaches. Something tells me it has to do with PHP's 'byte handling', because Python distinguishes between a byte string (returned by .encode('utf-8')) and a string. The outputs of both programs are shown at the end:

Python code:

import hashlib
from Crypto.Cipher import AES

key = 'testKey'
IV = '\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
ENC_KEY = hashlib.md5(key.encode('utf-8')).hexdigest()

print('key: "' + key + '"')
print('hashedKey: ' + ENC_KEY)
obj = AES.new(ENC_KEY.encode("utf8"), AES.MODE_CFB, IV.encode("utf8"))
test_data = 'test'
print('encrypting "' + test_data + '"')
encData = obj.encrypt(test_data.encode("utf8"))
print('encData: ' + encData.hex())

PHP code:

function encTest($testStr, $ENC_KEY)
{
    $iv = hex2bin('00000000000000000000000000000000');

    echo "aes-128-cfb8-1: ".bin2hex(openssl_encrypt($testStr, 'aes-128-cfb8', $ENC_KEY, OPENSSL_RAW_DATA, $iv))."\n";
    echo "aes-128-cfb1-1: ".bin2hex(openssl_encrypt($testStr, 'aes-128-cfb1', $ENC_KEY, OPENSSL_RAW_DATA, $iv))."\n";
    echo "aes-128-cfb-1: ".bin2hex(openssl_encrypt($testStr, 'aes-128-cfb', $ENC_KEY, OPENSSL_RAW_DATA, $iv))."\n";
    echo "\n";

    echo "aes-128-cfb8-2: ".bin2hex(openssl_encrypt($testStr, 'aes-128-cfb8', $ENC_KEY, OPENSSL_RAW_DATA|OPENSSL_ZERO_PADDING, $iv))."\n";
    echo "aes-128-cfb1-2: ".bin2hex(openssl_encrypt($testStr, 'aes-128-cfb1', $ENC_KEY, OPENSSL_RAW_DATA|OPENSSL_ZERO_PADDING, $iv))."\n";
    echo "aes-128-cfb-2: ".bin2hex(openssl_encrypt($testStr, 'aes-128-cfb', $ENC_KEY, OPENSSL_RAW_DATA|OPENSSL_ZERO_PADDING, $iv))."\n";
    echo "\n";

    echo "aes-128-cfb8-3: ".bin2hex(openssl_encrypt(utf8_encode($testStr), 'aes-128-cfb8', utf8_encode($ENC_KEY), OPENSSL_RAW_DATA|OPENSSL_ZERO_PADDING, $iv))."\n";
    echo "aes-128-cfb1-3: ".bin2hex(openssl_encrypt(utf8_encode($testStr), 'aes-128-cfb1', utf8_encode($ENC_KEY), OPENSSL_RAW_DATA|OPENSSL_ZERO_PADDING, $iv))."\n";
    echo "aes-128-cfb-3: ".bin2hex(openssl_encrypt(utf8_encode($testStr), 'aes-128-cfb', utf8_encode($ENC_KEY), OPENSSL_RAW_DATA|OPENSSL_ZERO_PADDING, $iv))."\n";
    echo "\n";

    echo "aes-128-cfb8-4: ".bin2hex(openssl_encrypt(utf8_encode($testStr), 'aes-128-cfb8', utf8_encode($ENC_KEY), OPENSSL_RAW_DATA, $iv))."\n";
    echo "aes-128-cfb1-4: ".bin2hex(openssl_encrypt(utf8_encode($testStr), 'aes-128-cfb1', utf8_encode($ENC_KEY), OPENSSL_RAW_DATA, $iv))."\n";
    echo "aes-128-cfb-4: ".bin2hex(openssl_encrypt(utf8_encode($testStr), 'aes-128-cfb', utf8_encode($ENC_KEY), OPENSSL_RAW_DATA, $iv))."\n";
    echo "\n";
}

$key = "testKey";
$ENC_KEY = hash('md5', utf8_encode($key));
echo "ENC_KEY: ".$ENC_KEY."\n";

$test = "test";
echo "encrypting \"".$test."\"\n";
encTest($test, $ENC_KEY);

Python output (encData should be replicated):

key: "testKey"
hashedKey: 24afda34e3f74e54b61a8e4cbe921650
encrypting "test"

encData: 117c1974

PHP output:

ENC_KEY: 24afda34e3f74e54b61a8e4cbe921650
encrypting "test"

aes-128-cfb8-1: b0016a55
aes-128-cfb1-1: bac44c56
aes-128-cfb-1: b0f1c27a

aes-128-cfb8-2: b0016a55
aes-128-cfb1-2: bac44c56
aes-128-cfb-2: b0f1c27a

aes-128-cfb8-3: b0016a55
aes-128-cfb1-3: bac44c56
aes-128-cfb-3: b0f1c27a

aes-128-cfb8-4: b0016a55
aes-128-cfb1-4: bac44c56
aes-128-cfb-4: b0f1c27a

Solution

  • In the PHP code (more precisely, in openssl_encrypt), the AES variant is specified explicitly in the cipher name, here aes-128-..., so PHP uses AES-128. A key that is too long is truncated, and a key that is too short is padded with 0x00 values. Since hash in the PHP code returns its result as a hex string, the 16-byte MD5 hash is represented by 32 characters (32 bytes), so PHP uses only the first 16 bytes of that hex string as the AES-128 key.

    The hexdigest method in the Python code also returns the result as a hex string. However, in the Python code (more precisely, in PyCryptodome), the AES variant is determined by the key size, so the Python code uses the full 32-byte key and thus AES-256.

    The different keys and AES variants are the main reason for the different results. To fix this issue, the same keys and AES variants must be used in both codes:

    • Option 1 is to use AES-128 in the Python code as well. This can be achieved by the following change:

      obj = AES.new(ENC_KEY[:16].encode("utf8"), AES.MODE_CFB, IV.encode("utf8"))
      

      Then the output b0016a55 is in accordance with the result of the PHP code for aes-128-cfb8.

    • Option 2 is to also use AES-256 in the PHP code. This can be done by replacing aes-128... with aes-256... Then the output is

      aes-256-cfb8-1: 117c1974
      aes-256-cfb1-1: 54096db1
      aes-256-cfb-1 : 11bfdaa9
      

    and, as expected, the output 117c1974 for aes-256-cfb8 matches the original value of the Python code.
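
The key-length mismatch described above can be checked with the standard library alone. The following is a minimal sketch: aes_variant_for_key and php_effective_key are hypothetical helpers written for illustration (they are not part of PyCryptodome or PHP), and the truncate/zero-pad rule is modelled on the description of openssl_encrypt given above.

```python
import hashlib

# Key sizes (in bytes) and the AES variant a key-length-driven API would select.
AES_VARIANTS = {16: "AES-128", 24: "AES-192", 32: "AES-256"}


def aes_variant_for_key(key: bytes) -> str:
    """Hypothetical helper: name the AES variant implied by the key length."""
    if len(key) not in AES_VARIANTS:
        raise ValueError(f"invalid AES key length: {len(key)} bytes")
    return AES_VARIANTS[len(key)]


def php_effective_key(key: bytes, size: int) -> bytes:
    """Hypothetical model of the key handling described above:
    truncate a key that is too long, pad a short one with 0x00 bytes."""
    return key[:size].ljust(size, b"\x00")


# The hex digest is 32 ASCII characters, i.e. a 32-byte key once encoded.
hex_key = hashlib.md5("testKey".encode("utf-8")).hexdigest().encode("utf-8")
print(len(hex_key), aes_variant_for_key(hex_key))  # 32 AES-256

# What an aes-128-... cipher would effectively use from the same input.
php_key = php_effective_key(hex_key, 16)
print(len(php_key), aes_variant_for_key(php_key))  # 16 AES-128
```

The two programs therefore start from the same 32-character string but end up with different keys and different AES variants, which is why Option 1 slices the key with [:16] and Option 2 switches the PHP cipher name to aes-256-...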


    The CFB mode turns a block cipher into a stream cipher. In each encryption step n bits are encrypted, which is denoted CFBn. For the exact details, see a description of the CFB mode of operation.

    The term CFBn (or cfbn) is also used in PHP: CFB1 means encryption of 1 bit per step, CFB8 of 8 bits (= one byte) and CFB of a whole block (16 bytes = 128 bits). In Python, the number of bits per step is specified with segment_size.

    I.e. the counterpart of ...-cfb8 in PHP is segment_size = 8 in Python and the counterpart of ...-cfb in PHP is segment_size = 128 in Python.
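
The naming correspondence above can be written down as a small lookup. This is only a sketch: cfb_params is a hypothetical helper introduced here for illustration, and it merely parses the cipher name; it does not call OpenSSL or PyCryptodome.

```python
import re

BLOCK_BITS = 128  # AES block size in bits


def cfb_params(php_cipher: str):
    """Hypothetical helper: map a PHP/OpenSSL AES-CFB cipher name to
    (key length in bytes, segment size in bits)."""
    match = re.fullmatch(r"aes-(128|192|256)-cfb(1|8)?", php_cipher.lower())
    if match is None:
        raise ValueError(f"not an AES-CFB cipher name: {php_cipher!r}")
    key_bytes = int(match.group(1)) // 8
    # No suffix means full-block feedback; a suffix gives the bits per step.
    segment_size = int(match.group(2)) if match.group(2) else BLOCK_BITS
    return key_bytes, segment_size


for name in ("aes-128-cfb8", "aes-128-cfb", "aes-256-cfb8", "aes-128-cfb1"):
    print(name, cfb_params(name))
```

The second value is what would be passed as segment_size to AES.new(key, AES.MODE_CFB, iv=..., segment_size=...) in PyCryptodome, and the first value is the key length that call needs in order to select the same AES variant.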

    In the following it is assumed that an identical key and an identical AES variant are used in both codes.

    Since segment_size = 8 is the default in PyCryptodome, the result of the Python code is the same as for ...-cfb8 in the PHP code. If segment_size = 128 is chosen in the Python code, the result is the same as for ...-cfb in the PHP code. However, in PyCryptodome the segment_size must be an integer multiple of 8, otherwise the error message "'segment_size' must be positive and multiple of 8 bits" is displayed. For this reason the CFB1 mode is not supported by PyCryptodome.
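
To make the effect of the segment size concrete, here is a minimal sketch of the CFB feedback loop for segment sizes that are multiples of 8 bits. Important assumption: toy_block_encrypt is a stand-in built from SHA-256, not AES, and it is not secure. It is used only because CFB needs just the forward direction of the block function, so the shift-register mechanics can be shown with the standard library. The ciphertexts therefore do not match the values above.

```python
import hashlib

BLOCK_SIZE = 16  # bytes, the same block size as AES


def toy_block_encrypt(key: bytes, block: bytes) -> bytes:
    """Stand-in for the AES block encryption (NOT AES, NOT secure)."""
    return hashlib.sha256(key + block).digest()[:BLOCK_SIZE]


def _cfb(key: bytes, iv: bytes, data: bytes, segment_size: int, decrypt: bool) -> bytes:
    if segment_size <= 0 or segment_size % 8 or segment_size > BLOCK_SIZE * 8:
        raise ValueError("segment_size must be a positive multiple of 8, at most 128")
    seg = segment_size // 8          # bytes processed per step
    register = iv                    # shift register, one block wide
    out = bytearray()
    for i in range(0, len(data), seg):
        chunk = data[i:i + seg]
        keystream = toy_block_encrypt(key, register)
        result = bytes(d ^ k for d, k in zip(chunk, keystream))
        out += result
        # Feed the ciphertext segment back: shift it into the register.
        cipher_chunk = chunk if decrypt else result
        register = (register + cipher_chunk)[-BLOCK_SIZE:]
    return bytes(out)


def cfb_encrypt(key, iv, plaintext, segment_size=8):
    return _cfb(key, iv, plaintext, segment_size, decrypt=False)


def cfb_decrypt(key, iv, ciphertext, segment_size=8):
    return _cfb(key, iv, ciphertext, segment_size, decrypt=True)


key, iv = b"k" * 16, bytes(16)
c8 = cfb_encrypt(key, iv, b"test", segment_size=8)
c128 = cfb_encrypt(key, iv, b"test", segment_size=128)
print(c8.hex(), c128.hex())
print(c8[0] == c128[0])  # True: the first keystream byte is the same
```

With both segment sizes the first step encrypts the IV, so the first ciphertext byte is identical and the outputs generally diverge afterwards. The same pattern is visible in the real outputs quoted above: b0016a55 versus b0f1c27a, and 117c1974 versus 11bfdaa9.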


    Also note:

    • The result of the digest can also be returned binary in both codes and not as hex string. To do this, the third parameter of the PHP method hash must be set to TRUE (default: FALSE). In Python, simply use the digest method instead of hexdigest.
    • In the PHP code, for a stream cipher mode like CFB, padding is automatically disabled, so the OPENSSL_ZERO_PADDING flag (which can be used to explicitly disable padding) makes no difference.
    • utf8_encode allows you to convert from ISO-8859-1 encoding to UTF-8, but since the $ENC_KEY consists of alphanumeric characters (hex encoding) this has no effect. In general, however, arbitrary binary data (such as the result of a digest) must not be UTF8 encoded, as this would corrupt the data. There are other encodings for this purpose, such as Base64. If the results of the digest are returned in binary form (see 1st point), no UTF8 encoding may be performed.
    • There is a bug in the legacy PyCrypto library in the context of CFB mode that requires the plaintext to have a length that is an integer multiple of the segment size. Otherwise the following error occurs: Input strings must be a multiple of the segment size 16 in length.
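
The notes on binary digests and UTF-8 encoding can be illustrated with the standard library. In this sketch, latin1_to_utf8 is a hypothetical helper that models an ISO-8859-1 to UTF-8 conversion (the behaviour attributed to utf8_encode above) on a Python byte string.

```python
import hashlib

md5 = hashlib.md5("testKey".encode("utf-8"))
hex_digest = md5.hexdigest()  # 32 hex characters
raw_digest = md5.digest()     # 16 raw bytes, usable directly as an AES-128 key
print(len(hex_digest), len(raw_digest))  # 32 16


def latin1_to_utf8(data: bytes) -> bytes:
    """Hypothetical helper: interpret bytes as ISO-8859-1 and re-encode as UTF-8."""
    return data.decode("latin-1").encode("utf-8")


# Pure ASCII (such as a hex string) passes through unchanged.
ascii_key = hex_digest.encode("ascii")
print(latin1_to_utf8(ascii_key) == ascii_key)  # True

# Bytes >= 0x80 are expanded to two bytes each, which corrupts binary data.
print(latin1_to_utf8(b"\x80\xff"))  # b'\xc2\x80\xc3\xbf'
```

This is why the conversion is harmless for the hex-encoded key in the question but must not be applied to a binary digest.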