Tags: php, node.js, openssl, cryptography, aes

Why Am I Getting Different Cipher Outputs Between Node.js and PHP When Reading Large Files by Chunks?



Problem Description

I need Node.js and PHP to produce identical ciphertext when encrypting large files in chunks with AES-256-CBC. The outputs differ even though both implementations read and process the same input with the same key and IV. The code for both implementations is below.

PHP Code

<?php
require 'vendor/autoload.php';

define('PLAINTEXT_DATA_KEY', 'poSENHhkGVG/4fEHvhRO6j9W3goETWZAg+ZgTWxhw34=');
define('IV', "X1bIRjgIoDn/BDFhHIbg7g==");
define('ALGORITHM', 'aes-256-cbc');
define('CHUNK_SIZE', 16 * 1024);

class Cipher
{
    private function pkcs7_pad(string $data, int $blockSize)
    {
        $padLength = $blockSize - (strlen($data) % $blockSize);
        return $data . str_repeat(chr($padLength), $padLength);
    }
    public function encrypt($source, $destination)
    {
        $inputFile = fopen($source, 'rb');
        $outputFile = fopen($destination, 'wb');
        try {
            fwrite($outputFile, base64_decode(IV));

            while (!feof($inputFile)) {
                $buffer = fread($inputFile, CHUNK_SIZE);
                // Pad the last chunk if it is not the block size
                if (feof($inputFile)) {
                    $buffer = $this->pkcs7_pad($buffer, 16);
                }
                $cipherText = openssl_encrypt($buffer, ALGORITHM, PLAINTEXT_DATA_KEY, OPENSSL_NO_PADDING, base64_decode(IV));
                fwrite($outputFile, $cipherText);
            }
        } catch (Exception $e) {
            throw $e;
        } finally {
            fclose($inputFile);
            fclose($outputFile);
        }
    }
}
?>

Node.js Code

const PADDING_BLOCK_SIZE = 16;
const ALGORITHM = "aes-256-cbc";
const PLAINTEXT_DATA_KEY = "poSENHhkGVG/4fEHvhRO6j9W3goETWZAg+ZgTWxhw34=";
const IV = "X1bIRjgIoDn/BDFhHIbg7g=="; // randombytes(16) converted to base64
const CHUNK_SIZE = 16 * 1024;

class Cipher {
    private pkcs7Pad(buffer: Buffer, blockSize: number = PADDING_BLOCK_SIZE): Buffer {
      const padding = blockSize - (buffer.length % blockSize);
      const padBuffer = Buffer.alloc(padding, padding);
      return Buffer.concat([buffer, padBuffer]);
    }
  
    async encrypt(source: string, dest: string) {
      return new Promise(async (res, rej) => {
        const iv = base64ToBuffer(IV);
  
        const cipher = createCipheriv(ALGORITHM, base64ToUint8Array(PLAINTEXT_DATA_KEY), iv);
        cipher.setAutoPadding(false);
  
        const readStream = createReadStream(source, { highWaterMark: CHUNK_SIZE });
        const writeStream = createWriteStream(dest, { highWaterMark: CHUNK_SIZE });
  
        writeStream.write(iv);
  
        let tempChunkStorage = Buffer.alloc(0); // Buffer to store remaining data
        readStream.on(DATA_EVENT, (chunk) => {
          if (typeof chunk === "string") {
            chunk = Buffer.from(chunk);
          }
  
          // Append the new chunk to the temp storage
          tempChunkStorage = Buffer.concat([tempChunkStorage, chunk]);
  
          while (tempChunkStorage.length >= CHUNK_SIZE) {
            const block = tempChunkStorage.subarray(0, CHUNK_SIZE);
            const encryptedBuffer = cipher.update(block);
            writeStream.write(encryptedBuffer);
            tempChunkStorage = tempChunkStorage.subarray(CHUNK_SIZE);
          }
        });
        readStream.on("end", () => {
          if (tempChunkStorage.length > 0) {
            const encryptedBuffer = cipher.update(this.pkcs7Pad(tempChunkStorage)); // Add padding
            writeStream.write(encryptedBuffer);
            cipher.final();
          }
          writeStream.end();
          res(true);
        });
        readStream.on("error", (err) => {
          writeStream.close();
          rej(err);
        });
      });
    }
  }

First 50 characters of the ciphertext (Base64) in PHP:  0tCb9xtx5KpG+56ukYvcQDoNKCdoPtAFUrFDRc4TiqQrQocQRK

First 50 characters of the ciphertext (Base64) in Node: sUUI4nXHwhKNdRs+Brqc5neKuKb3fx4qqBohlDSn/7FVrYo46/


Solution

  • Both implementations have several problems.

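    1. The PHP code never Base64-decodes the key. It passes the 44-character Base64 string to openssl_encrypt() instead of the 32 raw key bytes, so PHP and Node.js do not encrypt with the same key.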

    2. The PHP code should also be changed so that chunked encryption produces the same ciphertext as encrypting the entire plaintext in one call. The posted code restarts CBC with the original IV for every chunk, whereas the Node.js code uses a single cipher whose state carries over from chunk to chunk. Once this is fixed, any chunk size can be used (as long as it is an integer multiple of the block size).

      This decouples encryption from decryption, since the decrypting side no longer needs to know the chunk size, and it also simplifies the Node.js implementation, as explained later.

      To achieve this, the following changes are required for CBC/PKCS#7 padding:

      • The default PKCS#7 padding must be disabled for all chunks except the last chunk.
      • The last ciphertext block of the n-th ciphertext chunk must be used as the IV for encrypting the (n+1)-th chunk.
    3. In addition, inconsistencies and inefficiencies should be eliminated (which also makes it easier to implement the above changes):

      • The PHP code uses the OPENSSL_NO_PADDING flag (value: 3), which is intended for asymmetric encryption. Its value happens to equal the bitwise OR of OPENSSL_RAW_DATA (value: 1) and OPENSSL_ZERO_PADDING (value: 2), the flags intended for symmetric encryption, so in effect it disables both the default Base64 encoding and the default PKCS#7 padding. Use the two symmetric flags explicitly instead.
      • PHP/OpenSSL uses PKCS#7 padding by default. A custom implementation is not required.
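The chaining rule from point 2 can be checked on the Node.js side as well. The following is only a minimal sketch, not part of the original answer: `encryptChained` and `encryptOneShot` are hypothetical helper names, and the key and IV are the test values from the question. It encrypts each chunk with a fresh cipher, pads only the last chunk, and uses the last ciphertext block of each chunk as the IV of the next one, which should give the same bytes as a single one-shot encryption.

```javascript
const crypto = require('node:crypto');

// Test key and IV from the question; both are Base64 and must be decoded first.
const key = Buffer.from('poSENHhkGVG/4fEHvhRO6j9W3goETWZAg+ZgTWxhw34=', 'base64'); // 32 bytes
const iv = Buffer.from('X1bIRjgIoDn/BDFhHIbg7g==', 'base64'); // 16 bytes
const BLOCK_SIZE = 16;

// Hypothetical helper: one cipher per chunk, chained via the last ciphertext block.
function encryptChained(plaintext, chunkSize) {
  if (chunkSize <= 0 || chunkSize % BLOCK_SIZE !== 0) {
    throw new Error('chunkSize must be a positive multiple of 16');
  }
  const parts = [];
  let chunkIv = iv;
  let offset = 0;
  let done = false;
  while (!done) {
    const chunk = plaintext.subarray(offset, offset + chunkSize);
    offset += chunkSize;
    const isLast = offset >= plaintext.length;
    const cipher = crypto.createCipheriv('aes-256-cbc', key, chunkIv);
    cipher.setAutoPadding(isLast); // PKCS#7 padding only on the last chunk
    const ciphertext = Buffer.concat([cipher.update(chunk), cipher.final()]);
    parts.push(ciphertext);
    // The last ciphertext block becomes the IV of the next chunk.
    chunkIv = ciphertext.subarray(ciphertext.length - BLOCK_SIZE);
    done = isLast;
  }
  return Buffer.concat(parts);
}

// Reference: encrypt the whole plaintext with a single cipher.
function encryptOneShot(plaintext) {
  const cipher = crypto.createCipheriv('aes-256-cbc', key, iv);
  return Buffer.concat([cipher.update(plaintext), cipher.final()]);
}

const sample = crypto.randomBytes(50000);
console.log(encryptChained(sample, 16 * 1024).equals(encryptOneShot(sample)));
```

If the two buffers compare equal for several plaintext lengths and chunk sizes, the chunked scheme is interchangeable with one-shot encryption, which is what the PHP changes aim for.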

    Overall, the described changes in the PHP code can be implemented as follows:

    ...
    $key = base64_decode(PLAINTEXT_DATA_KEY); // Base64 decode key
    $iv = base64_decode(IV);
    
    $inputFile = fopen($source, 'rb');
    $outputFile = fopen($dest, 'wb');
    
    fwrite($outputFile, $iv); // write initial IV
    $options = OPENSSL_RAW_DATA | OPENSSL_ZERO_PADDING; // disable Base64 encoding and padding
    while (!feof($inputFile)) {
        $buffer = fread($inputFile, CHUNK_SIZE); // CHUNK_SIZE must be an integer multiple of blocksize (16 bytes for AES)
        if (feof($inputFile)) {
            $options = OPENSSL_RAW_DATA; // enable padding for the last chunk
        }
        $cipherText = openssl_encrypt($buffer, ALGORITHM, $key, $options, $iv); 
        $iv = substr($cipherText, -16); // determine IV for the next chunk
        fwrite($outputFile, $cipherText); // write ciphertext chunk
    }
    
    fclose($inputFile);
    fclose($outputFile);
    ...
    

    As mentioned above, one advantage of these changes is that the ciphertext no longer depends on the chunk size.
    The chunk size can therefore be left to the stream internals on the Node.js side, which significantly shortens the encryption code:

    ...
    var key = Buffer.from(PLAINTEXT_DATA_KEY, 'base64');
    var iv = Buffer.from(IV, 'base64');
    
    var readStream = fs.createReadStream(pathPlaintextFile);
    var writeStream = fs.createWriteStream(pathCiphertextFile);
    
    writeStream.write(iv); // write IV
    var cipher = crypto.createCipheriv('aes-256-cbc', key, iv);
    readStream.pipe(cipher).pipe(writeStream); // encrypt the stream and append the ciphertext after the IV
    ...
    

    With these changes, both sides produce identical ciphertexts (for identical input data).
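The chunk-size independence on the Node.js side can be illustrated without any file I/O. This is a sketch under assumptions: `encryptInSlices` is a hypothetical helper, the key and IV are the test values from the question, and the slice sizes are arbitrary. A single cipher object is fed the plaintext in slices of different sizes, and the output (IV followed by ciphertext, the same layout as the files above) should be the same each time.

```javascript
const crypto = require('node:crypto');

// Test key and IV from the question, Base64-decoded.
const key = Buffer.from('poSENHhkGVG/4fEHvhRO6j9W3goETWZAg+ZgTWxhw34=', 'base64');
const iv = Buffer.from('X1bIRjgIoDn/BDFhHIbg7g==', 'base64');

// Hypothetical helper: pushes the plaintext through ONE cipher in slices of sliceSize bytes.
// sliceSize does not have to be a multiple of the block size; the cipher buffers partial blocks.
function encryptInSlices(plaintext, sliceSize) {
  const cipher = crypto.createCipheriv('aes-256-cbc', key, iv);
  const parts = [iv]; // layout: IV first, then the ciphertext
  for (let offset = 0; offset < plaintext.length; offset += sliceSize) {
    parts.push(cipher.update(plaintext.subarray(offset, offset + sliceSize)));
  }
  parts.push(cipher.final()); // emits the PKCS#7-padded final block
  return Buffer.concat(parts);
}

const text = Buffer.from('The quick brown fox jumps over the lazy dog. '.repeat(1000));
const small = encryptInSlices(text, 7);
const large = encryptInSlices(text, 64 * 1024);
console.log(small.equals(large));
```

This is the same reason `readStream.pipe(cipher)` needs no manual buffering: whatever slice sizes the read stream delivers, the cipher keeps its CBC state between `update()` calls.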


    Security:
    If the static IV is used for more than testing, note that reusing a key/IV pair is a vulnerability.
    For a fixed key, do not use a static IV. Generate a fresh random IV for each encryption instead.
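A random IV per encryption could look like the sketch below. It is an illustration under assumptions rather than the author's code: `encryptWithRandomIv` and `decryptWithPrependedIv` are hypothetical names, and the key is the test key from the question. The IV is not secret, so it is stored in front of the ciphertext, the same layout the code above already uses, and the decrypting side reads it back from the first 16 bytes.

```javascript
const crypto = require('node:crypto');

// Test key from the question, Base64-decoded (for illustration only).
const key = Buffer.from('poSENHhkGVG/4fEHvhRO6j9W3goETWZAg+ZgTWxhw34=', 'base64');

// Hypothetical helper: fresh random IV for every call, prepended to the ciphertext.
function encryptWithRandomIv(plaintext) {
  const iv = crypto.randomBytes(16);
  const cipher = crypto.createCipheriv('aes-256-cbc', key, iv);
  return Buffer.concat([iv, cipher.update(plaintext), cipher.final()]);
}

// Hypothetical helper: reads the IV from the first 16 bytes, then decrypts the rest.
function decryptWithPrependedIv(data) {
  const iv = data.subarray(0, 16);
  const decipher = crypto.createDecipheriv('aes-256-cbc', key, iv);
  return Buffer.concat([decipher.update(data.subarray(16)), decipher.final()]);
}

const message = Buffer.from('same plaintext');
const first = encryptWithRandomIv(message);
const second = encryptWithRandomIv(message);
console.log(first.equals(second)); // identical plaintexts should no longer give identical output
console.log(decryptWithPrependedIv(first).toString());
```

With a random IV the ciphertexts of two runs can no longer be compared byte for byte, so a cross-language check between PHP and Node.js would decrypt one side's output with the other side's code instead.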