java encryption aes libreoffice ods

Encrypt a file inside an ODS archive


I'm trying to reproduce the LibreOffice encryption of a file inside an ODS (open document spreadsheet) archive. See http://docs.oasis-open.org/office/v1.2/os/OpenDocument-v1.2-os-part3.html#__RefHeading__752811_826425813 for technical information.

The best summary I found is on Wikipedia:

When OpenDocument file is password protected the file structure of the bundle remains the same, but contents of XML files in the package are encrypted using following algorithm:

  1. The file contents are compressed with the DEFLATE algorithm.
  2. A checksum of a portion of the compressed file is computed (SHA-1 of the file contents, or SHA-1 of the first 1024 bytes of the file, or SHA-256 of the first 1024 bytes of the file) and stored so password correctness can be verified when decrypting.
  3. A digest (hash) of the user entered password in UTF-8 encoding is created and passed to the package component. ODF versions 1.0 and 1.1 only mandate support for the SHA-1 digest here, while version 1.2 recommends SHA-256.
  4. This digest is used to produce a derived key by undergoing key stretching with PBKDF2 using HMAC-SHA-1 with a salt of arbitrary length (in ODF 1.2 – it's 16 bytes in ODF 1.1 and below) generated by the random number generator for an arbitrary iteration count (1024 by default in ODF 1.2).
  5. The random number generator is used to generate a random initialization vector for each file.
  6. The initialization vector and derived key are used to encrypt the compressed file contents. ODF 1.0 and 1.1 use Blowfish in 8-bit cipher feedback mode, while ODF 1.2 considers it a legacy algorithm and allows Triple DES and AES (with 128, 192 or 256 bits), both in cipher block chaining mode, to be used instead.

My un-encrypted module content (encoding: utf-8, line break: LF) is:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE script:module PUBLIC "-//OpenOffice.org//DTD OfficeDocument 1.0//EN" "module.dtd">
<script:module xmlns:script="http://openoffice.org/2000/script" script:name="Module1" script:language="StarBasic" script:moduleType="normal">REM  *****  BASIC  *****
REM Hello, world!
</script:module>

The encrypted module content produced by LibreOffice and stored in the ODS archive, is (in hexadecimal):

[a3, f4, 61, 98, c1, c8, e8, b1, d3, fa, b0, bc, ef, 51, 87, da, 4c, d8, 92, c2, 09, 7f, 12, 19, 47, 44, af, 3b, 32, 9d, 4a, 33, eb, ab, c0, 45, 97, 00, 27, 60, cf, b3, 49, 55, 76, 46, e2, 3c, 35, a0, a7, a9, 8a, af, a3, cd, 3c, f3, 20, 5f, 83, 89, a4, 9c, d9, b5, a6, f5, db, 68, 0a, b4, d0, 15, 3e, 6d, af, c6, 16, 78, 29, 79, 42, cb, 56, e3, b1, cd, c1, a6, a0, 13, 91, 16, e3, 89, a8, c6, d4, 69, e8, ea, 87, e9, 9d, 09, bb, 03, a0, 6e, a0, 29, 37, 85, 9a, 59, fb, 47, 3a, 72, 1d, 85, 25, b0, 92, 37, 55, a4, eb, de, 03, eb, de, e1, b6, f3, f9, 7b, 3a, 09, 2c, ad, 8e, ff, 1e, a2, 79, 63, 12, 04, 93, 67, 3d, 59, 6c, e8, aa, ae, 37, 7e, 66, cf, 99, 54, 63, a5, ea, 31, 78, 44, b1, 54, be, 5a, af, 3f, 0d, bf, b5, ce, 98, c8, 7a, 44, 61, d4, 76, 69, 3b, 01, 6f, 27, ab, 5f, a2, b0, 98, 32, 52, 0c, 9c, 08, 0c, 6a, 0c, 54, e0, 83, dc, d0, ad, 3a, 0f, 0f, 75, 6f, e6, 0d, db, db, 50, a4, 2b, d3, 5f, 43, 7c, 2d, 16, fa, 87, 62, 09, f6, d2, 28, 31, b5, a0, be]

And here's the relevant part of the manifest produced by LibreOffice:

    <manifest:file-entry manifest:full-path="Basic/Test/Module1.xml" manifest:media-type="text/xml" manifest:size="332">
     <manifest:encryption-data manifest:checksum-type="urn:oasis:names:tc:opendocument:xmlns:manifest:1.0#sha256-1k" manifest:checksum="/UdU2OKZn04r0e9O047PaWNqi7LGaHYN9mURmvMCM60=">
      <manifest:algorithm manifest:algorithm-name="http://www.w3.org/2001/04/xmlenc#aes256-cbc" manifest:initialisation-vector="ZEk8JHG3bHu8kZw0VGOT+g=="/>
      <manifest:key-derivation manifest:key-derivation-name="PBKDF2" manifest:key-size="32" manifest:iteration-count="100000" manifest:salt="jGIagiBnlFdvQctdCkYfRQ=="/>
      <manifest:start-key-generation manifest:start-key-generation-name="http://www.w3.org/2000/09/xmldsig#sha256" manifest:key-size="32"/>
     </manifest:encryption-data>
    </manifest:file-entry>

The password was 123.
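
As a sanity check on these parameters, the Base64 attributes can be decoded and their lengths inspected. The following is only a minimal sketch using the JDK's java.util.Base64; the class name ManifestParams and the helper decode are made up for illustration. A 16-byte initialisation vector matches the AES block size, and a 32-byte checksum matches a SHA-256 digest.

```java
import java.util.Base64;

class ManifestParams {
    /** Decodes one Base64 attribute value copied from the manifest (hypothetical helper). */
    static byte[] decode(String base64) {
        return Base64.getDecoder().decode(base64);
    }

    public static void main(String[] args) {
        byte[] checksum = decode("/UdU2OKZn04r0e9O047PaWNqi7LGaHYN9mURmvMCM60=");
        byte[] iv = decode("ZEk8JHG3bHu8kZw0VGOT+g==");
        byte[] salt = decode("jGIagiBnlFdvQctdCkYfRQ==");

        // Print the decoded lengths in bytes
        System.out.println("checksum: " + checksum.length); // 32, the size of a SHA-256 digest
        System.out.println("iv: " + iv.length);             // 16, one AES block
        System.out.println("salt: " + salt.length);         // 16
    }
}
```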


Here's my code:

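// needs a dependency on `org.bouncycastle/bcprov-jdk15on/1.65`
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.security.InvalidAlgorithmParameterException;
import java.security.InvalidKeyException;
import java.security.Key;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.Security;
import java.security.spec.InvalidKeySpecException;
import java.security.spec.KeySpec;
import java.util.Arrays;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;
import javax.crypto.BadPaddingException;
import javax.crypto.Cipher;
import javax.crypto.IllegalBlockSizeException;
import javax.crypto.NoSuchPaddingException;
import javax.crypto.SecretKey;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;
import org.bouncycastle.jce.provider.BouncyCastleProvider;
// assumed: BouncyCastle's Base64, whose static decode(String) matches the calls below
import org.bouncycastle.util.encoders.Base64;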

public class EncryptMacro {
    public static void main(String[] args)
            throws IOException, NoSuchAlgorithmException, InvalidKeySpecException,
            IllegalBlockSizeException, InvalidKeyException, BadPaddingException,
            InvalidAlgorithmParameterException, NoSuchPaddingException {
        new EncryptMacro().encryptAsLO();
    }

    public void encryptAsLO() throws IOException, NoSuchAlgorithmException,
            InvalidKeySpecException, NoSuchPaddingException, InvalidAlgorithmParameterException,
            InvalidKeyException, BadPaddingException, IllegalBlockSizeException {
        // needs a dependency on `org.bouncycastle/bcprov-jdk15on/1.65`
        Security.addProvider(new BouncyCastleProvider());

        // copy the manifest parameters
        int plainSize = 332;
        byte[] checksum = Base64.decode("/UdU2OKZn04r0e9O047PaWNqi7LGaHYN9mURmvMCM60=");
        byte[] iv = Base64.decode("ZEk8JHG3bHu8kZw0VGOT+g==");
        int iterationCount = 100000;
        byte[] salt = Base64.decode("jGIagiBnlFdvQctdCkYfRQ==");
        int startKeySize = 32;
        int keySize = 32;

        // password
        String password = "123"; // that's for testing purpose!

        String plainText =
                "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<!DOCTYPE script:module PUBLIC \"-//OpenOffice.org//DTD OfficeDocument 1.0//EN\" \"module.dtd\">\n<script:module xmlns:script=\"http://openoffice.org/2000/script\" script:name=\"Module1\" script:language=\"StarBasic\" script:moduleType=\"normal\">REM  *****  BASIC  *****\nREM Hello, world!\n</script:module>";
        byte[] encrypted =
                new byte[]{-93, -12, 97, -104, -63, -56, -24, -79, -45, -6, -80, -68, -17, 81, -121,
                        -38, 76, -40, -110, -62, 9, 127, 18, 25, 71, 68, -81, 59, 50, -99, 74, 51,
                        -21, -85, -64, 69, -105, 0, 39, 96, -49, -77, 73, 85, 118, 70, -30, 60, 53,
                        -96, -89, -87, -118, -81, -93, -51, 60, -13, 32, 95, -125, -119, -92, -100,
                        -39, -75, -90, -11, -37, 104, 10, -76, -48, 21, 62, 109, -81, -58, 22, 120,
                        41, 121, 66, -53, 86, -29, -79, -51, -63, -90, -96, 19, -111, 22, -29, -119,
                        -88, -58, -44, 105, -24, -22, -121, -23, -99, 9, -69, 3, -96, 110, -96, 41,
                        55, -123, -102, 89, -5, 71, 58, 114, 29, -123, 37, -80, -110, 55, 85, -92,
                        -21, -34, 3, -21, -34, -31, -74, -13, -7, 123, 58, 9, 44, -83, -114, -1, 30,
                        -94, 121, 99, 18, 4, -109, 103, 61, 89, 108, -24, -86, -82, 55, 126, 102,
                        -49, -103, 84, 99, -91, -22, 49, 120, 68, -79, 84, -66, 90, -81, 63, 13,
                        -65, -75, -50, -104, -56, 122, 68, 97, -44, 118, 105, 59, 1, 111, 39, -85,
                        95, -94, -80, -104, 50, 82, 12, -100, 8, 12, 106, 12, 84, -32, -125, -36,
                        -48, -83, 58, 15, 15, 117, 111, -26, 13, -37, -37, 80, -92, 43, -45, 95, 67,
                        124, 45, 22, -6, -121, 98, 9, -10, -46, 40, 49, -75, -96, -66};

        // check the plain text size
        byte[] source = plainText.getBytes(StandardCharsets.UTF_8);
        this.check("Plain size", plainSize == source.length);
        // deflate the content (see 1. above)
        byte[] deflated = this.deflate(source);
        // and check the checksum (see 2. above)
        this.check("Deflated hash", Arrays.equals(checksum, this.getSha256_1k(deflated)));

        // hash the password (see 3. above)
        byte[] hashedPassword = this.getSha256_1k(password.getBytes(StandardCharsets.UTF_8));
        char[] chars = new char[hashedPassword.length];
        for (int i = 0; i < hashedPassword.length; i++) {
            chars[i] = (char) hashedPassword[i];
        }
        this.check("Start key size", chars.length == startKeySize);
        // or:
        // char[] chars = password.toCharArray();

        // get the key (see 4. above)
        SecretKeyFactory factory = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256");
        KeySpec keySpec = new PBEKeySpec(chars, salt, iterationCount, keySize * 8);
        SecretKey s = factory.generateSecret(keySpec);
        Key key = new SecretKeySpec(s.getEncoded(), "AES");

        // encrypt the data (see 6. above)
        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS7Padding");
        // or:
        // Cipher cipher = Cipher.getInstance("AES/CBC/ISO10126Padding"); // W3C padding
        cipher.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));
        byte[] result = cipher.doFinal(deflated);

        this.check("Encrypted", Arrays.equals(encrypted, result));
    }

    private byte[] deflate(byte[] data) throws IOException {
        InputStream is = new ByteArrayInputStream(data);
        ByteArrayOutputStream os = new ByteArrayOutputStream();
        final byte[] buffer = new byte[16];  // for testing purpose

        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION, true);
        DeflaterOutputStream dos = new DeflaterOutputStream(os, deflater);
        int count = is.read(buffer);
        while (count != -1) {
            dos.write(buffer, 0, count);
            count = is.read(buffer);
        }
        dos.close();
        return os.toByteArray();
    }

    private byte[] getSha256_1k(byte[] data) throws NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        digest.update(data, 0, Math.min(data.length, 1024));
        return digest.digest();
    }

    private void check(String text, boolean test) {
        if (test) {
            System.out.println(text + " ok");
        } else {
            System.out.println(text + " NOT ok");
            System.exit(1);
        }
    }
}

The output is:

Plain size ok
Deflated hash ok
Start key size ok
Encrypted NOT ok

Of course, I would like the generated encrypted data to be identical to the data stored in the ODS archive. I tried changing the padding, changing the key derivation function, passing the password directly to PBEKeySpec, triple-checking the password, etc., without success. I also had a look at the source code of LibreOffice (https://github.com/LibreOffice/core/tree/master/oox/source/crypto), but did not manage to find what is wrong in my code. (If that matters, I used LibreOffice Calc Version: 6.0.7.3 on Ubuntu 18.04.10 and Java 8.)

My question is: where is my mistake and how do I fix it?


Solution

  • There are three issues in your code:

    1. According to the specification, PBKDF2 is used with HMAC-SHA1 (and not HMAC-SHA256); see section 3.4.2, Encryption Process.
    2. The key s derived with PBKDF2WithHmacSHA256 is an instance of PBKDF2KeyImpl, which requires a UTF8 string as password (see docs of the PBKDF2KeyImpl class) and encodes the char array as UTF8 before using it. Here, however, the password is a hash, i.e. arbitrary bytes that are generally not valid UTF8, so casting each hash byte to a char changes the bytes that actually reach PBKDF2. A possible solution is to replace PBEKeySpec with BouncyCastle's PKCS5S2ParametersGenerator, which expects the password as a byte array (in init). For this solution replace

      SecretKeyFactory factory = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256");
      KeySpec keySpec = new PBEKeySpec(chars, salt, iterationCount, keySize * 8);
      SecretKey s = factory.generateSecret(keySpec);
      Key key = new SecretKeySpec(s.getEncoded(), "AES");
      

      with

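      // imports needed: org.bouncycastle.crypto.PBEParametersGenerator,
      // org.bouncycastle.crypto.generators.PKCS5S2ParametersGenerator,
      // org.bouncycastle.crypto.digests.SHA1Digest, org.bouncycastle.crypto.params.KeyParameter
      PBEParametersGenerator generator = new PKCS5S2ParametersGenerator(new SHA1Digest());
      // the raw hash bytes are passed as the password, with no char/UTF8 conversion
      generator.init(hashedPassword, salt, iterationCount);
      KeyParameter keyParam = (KeyParameter)generator.generateDerivedParameters(keySize * 8);
      Key key = new SecretKeySpec(keyParam.getKey(), "AES");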
      
    3. The padding used is ISO10126Padding, so AES/CBC/PKCS7Padding must be replaced by AES/CBC/ISO10126Padding. The easiest way to verify this is to decrypt the target ciphertext (encrypted) without removing the padding (AES/CBC/NoPadding). The last block is 06230276DDC67229EB31E830A1D7500F, which complies with ISO10126Padding. For ISO10126Padding, the last byte specifies the number of padding bytes, which (apart from the last byte) consist of random values. Here the last byte is 0x0F, so the last 15 bytes of this block are padding bytes.

      ISO10126Padding is also the reason why a comparison of the ciphertext on byte level with

      this.check("Encrypted", Arrays.equals(encrypted, result));
      

      fails: the random padding bytes make the last ciphertext block differ on every run. When comparing the ciphertext, the padded (last) block must therefore not be taken into account.
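
If you prefer to avoid the BouncyCastle generator, fixes 2 and 3 can also be sketched with the JDK alone. The following is only a sketch, not LibreOffice's implementation: the class OdfCryptoSketch and all of its method names are made up for illustration. It implements PBKDF2 with HMAC-SHA1 over a raw byte password (so no char/UTF8 conversion takes place), reads the padding length from the last byte of an ISO10126 padded block, and compares two ciphertexts while skipping the final block. I have not run it against the ciphertext from the question; the PBKDF2 part is checked only against the RFC 6070 test vector.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

class OdfCryptoSketch {
    /** PBKDF2 with HMAC-SHA1; the (non-empty) password is used as raw bytes. */
    static byte[] pbkdf2HmacSha1(byte[] password, byte[] salt, int iterations, int keyLen)
            throws GeneralSecurityException {
        Mac mac = Mac.getInstance("HmacSHA1");
        mac.init(new SecretKeySpec(password, "HmacSHA1"));
        int hLen = mac.getMacLength(); // 20 bytes for SHA-1
        byte[] derived = new byte[keyLen];
        for (int block = 1, offset = 0; offset < keyLen; block++, offset += hLen) {
            // U_1 = HMAC(password, salt || big-endian 32-bit block index)
            mac.update(salt);
            byte[] u = mac.doFinal(ByteBuffer.allocate(4).putInt(block).array());
            byte[] t = u.clone();
            // T = U_1 xor U_2 xor ... xor U_c, with U_j = HMAC(password, U_{j-1})
            for (int j = 1; j < iterations; j++) {
                u = mac.doFinal(u);
                for (int k = 0; k < hLen; k++) {
                    t[k] ^= u[k];
                }
            }
            System.arraycopy(t, 0, derived, offset, Math.min(hLen, keyLen - offset));
        }
        return derived;
    }

    /** Number of padding bytes announced by the last byte of an ISO10126 padded block. */
    static int iso10126PadLength(byte[] lastBlock) {
        int n = lastBlock[lastBlock.length - 1] & 0xFF;
        if (n < 1 || n > lastBlock.length) {
            throw new IllegalArgumentException("invalid padding length: " + n);
        }
        return n;
    }

    /** Compares two ciphertexts of equal length, ignoring the final (randomly padded) block. */
    static boolean equalsIgnoringLastBlock(byte[] a, byte[] b, int blockSize) {
        if (a.length != b.length || a.length < blockSize) {
            return false;
        }
        int n = a.length - blockSize;
        return Arrays.equals(Arrays.copyOf(a, n), Arrays.copyOf(b, n));
    }

    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    static byte[] fromHex(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    public static void main(String[] args) throws GeneralSecurityException {
        // RFC 6070 test vector: P = "password", S = "salt", c = 1, dkLen = 20
        byte[] dk = pbkdf2HmacSha1("password".getBytes(StandardCharsets.US_ASCII),
                "salt".getBytes(StandardCharsets.US_ASCII), 1, 20);
        System.out.println(toHex(dk)); // 0c60c80f961f0e71f3a9b524af6012062fe037a6

        // Last decrypted block quoted above: the final byte 0x0F announces 15 padding bytes
        System.out.println(iso10126PadLength(fromHex("06230276DDC67229EB31E830A1D7500F"))); // 15
    }
}
```

In the code from the question, the key would then presumably be new SecretKeySpec(pbkdf2HmacSha1(hashedPassword, salt, iterationCount, keySize), "AES"), and the final check would become this.check("Encrypted", equalsIgnoringLastBlock(encrypted, result, 16)).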