I'm trying to use PHP 7.4 to replicate a piece of Python code that uses PyCryptodome to do an AES-128-CFB encryption.
For this I'm using PHP's built-in openssl_encrypt function.
I tried several configuration parameters and CFB modes, but the results always differ from the Python output.
I found out that PyCryptodome's CFB implementation seems to use an 8-bit segment size, which should correspond to the aes-128-cfb8 mode in PHP's OpenSSL implementation.
The IV is intentionally fixed to 0, so please ignore the fact that this is insecure.
Here is the code I want to replicate, followed by the PHP code that tries to reproduce its results with different approaches.
Something tells me it has to do with PHP's 'byte handling', because Python distinguishes between a byte string (returned by .encode('utf-8')) and a string.
At the end you can see the outputs of both scripts:
Python code:
import hashlib
from Crypto.Cipher import AES
key = 'testKey'
IV = '\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
ENC_KEY = hashlib.md5(key.encode('utf-8')).hexdigest()
print('key: "' + key + '"')
print('hashedKey: ' + ENC_KEY)
obj = AES.new(ENC_KEY.encode("utf8"), AES.MODE_CFB, IV.encode("utf8"))
test_data = 'test'
print('encrypting "' + test_data + '"')
encData = obj.encrypt(test_data.encode("utf8"))
print('encData: ' + encData.hex())
PHP code:
function encTest($testStr, $ENC_KEY)
{
    $iv = hex2bin('00000000000000000000000000000000');
    echo "aes-128-cfb8-1: ".bin2hex(openssl_encrypt($testStr, 'aes-128-cfb8', $ENC_KEY, OPENSSL_RAW_DATA, $iv))."\n";
    echo "aes-128-cfb1-1: ".bin2hex(openssl_encrypt($testStr, 'aes-128-cfb1', $ENC_KEY, OPENSSL_RAW_DATA, $iv))."\n";
    echo "aes-128-cfb-1: ".bin2hex(openssl_encrypt($testStr, 'aes-128-cfb', $ENC_KEY, OPENSSL_RAW_DATA, $iv))."\n";
    echo "\n";
    echo "aes-128-cfb8-2: ".bin2hex(openssl_encrypt($testStr, 'aes-128-cfb8', $ENC_KEY, OPENSSL_RAW_DATA|OPENSSL_ZERO_PADDING, $iv))."\n";
    echo "aes-128-cfb1-2: ".bin2hex(openssl_encrypt($testStr, 'aes-128-cfb1', $ENC_KEY, OPENSSL_RAW_DATA|OPENSSL_ZERO_PADDING, $iv))."\n";
    echo "aes-128-cfb-2: ".bin2hex(openssl_encrypt($testStr, 'aes-128-cfb', $ENC_KEY, OPENSSL_RAW_DATA|OPENSSL_ZERO_PADDING, $iv))."\n";
    echo "\n";
    echo "aes-128-cfb8-3: ".bin2hex(openssl_encrypt(utf8_encode($testStr), 'aes-128-cfb8', utf8_encode($ENC_KEY), OPENSSL_RAW_DATA|OPENSSL_ZERO_PADDING, $iv))."\n";
    echo "aes-128-cfb1-3: ".bin2hex(openssl_encrypt(utf8_encode($testStr), 'aes-128-cfb1', utf8_encode($ENC_KEY), OPENSSL_RAW_DATA|OPENSSL_ZERO_PADDING, $iv))."\n";
    echo "aes-128-cfb-3: ".bin2hex(openssl_encrypt(utf8_encode($testStr), 'aes-128-cfb', utf8_encode($ENC_KEY), OPENSSL_RAW_DATA|OPENSSL_ZERO_PADDING, $iv))."\n";
    echo "\n";
    echo "aes-128-cfb8-4: ".bin2hex(openssl_encrypt(utf8_encode($testStr), 'aes-128-cfb8', utf8_encode($ENC_KEY), OPENSSL_RAW_DATA, $iv))."\n";
    echo "aes-128-cfb1-4: ".bin2hex(openssl_encrypt(utf8_encode($testStr), 'aes-128-cfb1', utf8_encode($ENC_KEY), OPENSSL_RAW_DATA, $iv))."\n";
    echo "aes-128-cfb-4: ".bin2hex(openssl_encrypt(utf8_encode($testStr), 'aes-128-cfb', utf8_encode($ENC_KEY), OPENSSL_RAW_DATA, $iv))."\n";
    echo "\n";
}
$key = "testKey";
$ENC_KEY = hash('md5', utf8_encode($key));
echo "ENC_KEY: ".$ENC_KEY."\n";
$test = "test";
echo "encrypting \"".$test."\"\n";
encTest($test, $ENC_KEY);
Python output (encData should be replicated):
key: "testKey"
hashedKey: 24afda34e3f74e54b61a8e4cbe921650
encrypting "test"
encData: 117c1974
PHP output:
key: "testKey"
hashedKey: 24afda34e3f74e54b61a8e4cbe921650
encrypting "test"
aes-128-cfb8-1: b0016a55
aes-128-cfb1-1: bac44c56
aes-128-cfb-1: b0f1c27a
aes-128-cfb8-2: b0016a55
aes-128-cfb1-2: bac44c56
aes-128-cfb-2: b0f1c27a
aes-128-cfb8-3: b0016a55
aes-128-cfb1-3: bac44c56
aes-128-cfb-3: b0f1c27a
aes-128-cfb8-4: b0016a55
aes-128-cfb1-4: bac44c56
aes-128-cfb-4: b0f1c27a
In the PHP code (more precisely, for openssl_encrypt), the AES variant is specified explicitly, in the current case with aes-128-..., i.e. PHP uses AES-128. A key that is too long is truncated, and a key that is too short is padded with 0 values. Since the hash method in the PHP code returns its result as a hex string, the 16-byte MD5 hash is represented by 32 characters (32 bytes), so in the current case PHP uses only the first 16 bytes of the key (AES-128).
The hexdigest method in the Python code also returns the result as a hex string. However, in the Python code (more precisely, for PyCryptodome), the AES variant is determined by the key size, i.e. the Python code uses the full 32-byte key and thus AES-256.
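To make the key-size point concrete, here is a small sketch that uses only hashlib. The helper aes_variant_for_key and the AES_VARIANTS table are hypothetical names introduced for illustration; they are not part of PyCryptodome, they only mirror the rule that the key length (16, 24 or 32 bytes) selects AES-128, AES-192 or AES-256.

```python
import hashlib

# Key sizes (in bytes) that AES accepts, and the variant each one selects.
AES_VARIANTS = {16: "AES-128", 24: "AES-192", 32: "AES-256"}


def aes_variant_for_key(key: bytes) -> str:
    """Hypothetical helper: name the AES variant implied by the key length."""
    try:
        return AES_VARIANTS[len(key)]
    except KeyError:
        raise ValueError(f"invalid AES key length: {len(key)} bytes") from None


md5 = hashlib.md5("testKey".encode("utf-8"))
hex_key = md5.hexdigest().encode("utf-8")  # what the Python code passes to AES.new
raw_key = md5.digest()                     # binary form of the same hash

print(len(hex_key), aes_variant_for_key(hex_key))            # 32 AES-256
print(len(hex_key[:16]), aes_variant_for_key(hex_key[:16]))  # 16 AES-128
print(len(raw_key), aes_variant_for_key(raw_key))            # 16 AES-128
```

The truncated key hex_key[:16] corresponds to what PHP effectively uses when the 32-character hex string is passed to an aes-128-... cipher.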
The different keys and AES variants are the main reason for the different results. To fix this issue, the same keys and AES variants must be used in both codes:
Option 1 is to use AES-128 in the Python code as well. This can be achieved by the following change:
obj = AES.new(ENC_KEY[:16].encode("utf8"), AES.MODE_CFB, IV.encode("utf8"))
Then the output b0016a55 is in accordance with the result of the PHP code for aes-128-cfb8.
Option 2 is to also use AES-256 in the PHP code. This can be done by replacing aes-128... with aes-256... in the cipher names. Then the output is:
aes-256-cfb8-1: 117c1974
aes-256-cfb1-1: 54096db1
aes-256-cfb-1 : 11bfdaa9
and, as expected, the output 117c1974 for aes-256-cfb8 matches the original value of the Python code.
The CFB mode turns a block cipher into a stream cipher. In each encryption step n bits are encrypted, which is denoted CFBn. For the exact details, see a description of the CFB mode of operation.
The term CFBn (or cfbn) is also used in PHP, i.e. CFB1 means encryption of one bit, CFB8 of 8 bits (= one byte) and CFB of a whole block (16 bytes). In Python, the number of bits per step is specified with segment_size.
I.e. the counterpart of ...-cfb8 in PHP is segment_size = 8 in Python, and the counterpart of ...-cfb in PHP is segment_size = 128 in Python.
In the following it is assumed that an identical key and an identical AES variant are used in both codes.
Since segment_size = 8 is the default, the result from the Python code is the same as for ...-cfb8 from the PHP code. If segment_size = 128 is chosen in the Python code, the result is the same as for ...-cfb in the PHP code. However, in PyCryptodome the segment_size must be an integer multiple of 8, otherwise the error message 'segment_size' must be positive and multiple of 8 bits is displayed. For this reason the CFB1 mode is not supported by PyCryptodome.
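To illustrate how the segment size enters the computation, here is a rough sketch of CFBn in pure Python. Note the assumptions: toy_block_encrypt is a stand-in built on MD5, it is NOT AES and not secure, so the ciphertexts will not match PHP or PyCryptodome. The function name cfb and its parameters are my own. Only the feedback mechanics are meant to be representative, including the restriction to multiples of 8 bits.

```python
import hashlib

BLOCK_SIZE = 16  # bytes, the same as the AES block size


def toy_block_encrypt(key: bytes, block: bytes) -> bytes:
    """Stand-in for AES block encryption (assumption: NOT AES, NOT secure)."""
    return hashlib.md5(key + block).digest()


def cfb(key: bytes, iv: bytes, data: bytes, segment_size: int = 8,
        decrypt: bool = False) -> bytes:
    """CFBn sketch: process segment_size bits per step."""
    if segment_size <= 0 or segment_size % 8 != 0 or segment_size > BLOCK_SIZE * 8:
        raise ValueError("'segment_size' must be positive and multiple of 8 bits")
    if len(iv) != BLOCK_SIZE:
        raise ValueError("IV must be exactly one block long")

    seg = segment_size // 8  # segment length in bytes
    register = iv            # shift register, starts as the IV
    out = bytearray()
    for i in range(0, len(data), seg):
        chunk = data[i:i + seg]
        # CFB only ever uses the block cipher in the encrypt direction.
        keystream = toy_block_encrypt(key, register)
        result = bytes(a ^ b for a, b in zip(chunk, keystream))
        out += result
        # The ciphertext segment is shifted into the register in both directions.
        cipher_chunk = chunk if decrypt else result
        register = (register + cipher_chunk)[-BLOCK_SIZE:]
    return bytes(out)


key = b"0123456789abcdef"
iv = bytes(BLOCK_SIZE)
ct8 = cfb(key, iv, b"test", segment_size=8)      # like ...-cfb8
ct128 = cfb(key, iv, b"test", segment_size=128)  # like ...-cfb
print(ct8.hex(), ct128.hex())
print(cfb(key, iv, ct8, segment_size=8, decrypt=True))
```

For the same key and IV, the first ciphertext byte is identical for both segment sizes, because in both cases it is the first plaintext byte XORed with the first byte of the encrypted IV. This is also visible in the outputs above (b0... for both aes-128-cfb8 and aes-128-cfb, 11... for both aes-256-cfb8 and aes-256-cfb).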
Also note:
1. To get the digest in binary form in PHP, the third parameter of hash must be set to TRUE (default: FALSE). In Python, simply use the digest method instead of hexdigest.
2. Since CFB is a stream cipher mode and needs no padding, the OPENSSL_ZERO_PADDING flag (which can be used to explicitly disable padding) makes no difference.
3. utf8_encode converts from ISO-8859-1 encoding to UTF-8, but since $ENC_KEY consists of alphanumeric characters (hex encoding), this has no effect. In general, however, arbitrary binary data (such as the result of a digest) must not be UTF-8 encoded, as this would corrupt the data. There are other encodings for this purpose, such as Base64. If the results of the digest are returned in binary form (see 1st point), no UTF-8 encoding may be performed.
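The effect described in the last point can be reproduced with a short sketch. The Python function utf8_encode below is my own imitation of PHP's utf8_encode (interpret the bytes as ISO-8859-1 and re-encode them as UTF-8), and the raw bytes are simply the hashedKey value from the question decoded from hex.

```python
import base64

# The 16-byte MD5 digest from the question, in binary form.
raw = bytes.fromhex("24afda34e3f74e54b61a8e4cbe921650")


def utf8_encode(data: bytes) -> bytes:
    """Imitation of PHP's utf8_encode: ISO-8859-1 bytes re-encoded as UTF-8."""
    return data.decode("iso-8859-1").encode("utf-8")


# Binary data is corrupted: every byte >= 0x80 becomes two bytes.
mangled = utf8_encode(raw)
print(len(raw), len(mangled))  # 16 24

# A hex string is pure ASCII, so the conversion leaves it unchanged.
hex_key = raw.hex().encode("ascii")
print(utf8_encode(hex_key) == hex_key)  # True

# Base64 is a safe text encoding for binary data and round-trips exactly.
b64 = base64.b64encode(raw)
print(base64.b64decode(b64) == raw)  # True
```

The mangled value is no longer a valid 16-byte key, which is why a binary digest must be passed to the cipher as is.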