java python character-encoding cryptography pbkdf2

Java byte array to string must equal Python bytearray string when generating a secret with SecretKeyFactory


I have a task to rewrite some Python crypto code in Java. I'm new to Python. Python code:

from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.kdf.pbkdf2 import PBKDF2HMAC
from cryptography.hazmat.backends import default_backend

backend = default_backend()

PASSWORD = bytes((1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16))

# salt and iterations are defined elsewhere; see the UPDATE below for concrete values
key = PBKDF2HMAC(hashes.SHA256(), 32, salt, iterations, backend).derive(PASSWORD)

My java implementation:

import javax.crypto.SecretKey;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

byte[] PASSWORD = new byte[]{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16};
SecretKeyFactory factory = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256");
// The byte[] password is decoded to a String only to obtain the char[] that PBEKeySpec requires
SecretKey tmp = factory.generateSecret(new PBEKeySpec(new String(PASSWORD).toCharArray(), salt, iterations, 256));
byte[] key = tmp.getEncoded();

As you can see, PASSWORD is a byte array that I get from a hex string, i.e. 010203....0F10. I can't change it (i.e. I can't specify it as a string in the Python implementation; as I understand it, the server also transforms PASSWORD to a byte array). Everything worked fine with this dummy PASSWORD, i.e. the keys generated by the Python and Java code were equal. But I ran into a problem when the password changed to an arbitrary one, for example AFFFFFFFFDBGEHTH.... As I understand it, the problem is Java's representation of bytes as signed integers. For example, when I convert the hex string "FFFAAABBBCCCDDDDFFAAAAAAAAAAAABB" to a byte array, in Java it is [-1, -6, -86, -69, -68, -52, -35, -35, -1, -86, -86, -86, -86, -86, -86, -69], but in Python it is [255, 250, 170, 187, 188, 204, 221, 221, 255, 170, 170, 170, 170, 170, 170, 187]. Then, when I convert the Java byte array to a char array for the PBEKeySpec constructor, new PBEKeySpec(new String(new byte[]{-1, -6, -86, -69, -68, -52, -35, -35, -1, -86, -86, -86, -86, -86, -86, -69}).toCharArray()..., it does not work as expected.
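The signed and unsigned listings above can be compared directly. The following is a minimal Python sketch, not part of the original code (the variable names are illustrative), suggesting that both listings describe the same 16 bytes and differ only in how each language prints them:

```python
# Hex string taken from the question.
raw = bytes.fromhex("FFFAAABBBCCCDDDDFFAAAAAAAAAAAABB")

# Python displays each byte as an unsigned value in 0..255.
unsigned = list(raw)

# Java displays the same bit patterns as signed values in -128..127.
signed = [b - 256 if b > 127 else b for b in raw]

# Masking a signed value with 0xFF recovers the unsigned one,
# so the byte[] itself carries the same information in both languages.
recovered = [s & 0xFF for s in signed]

print(unsigned)
print(signed)
print(recovered == unsigned)  # True
```

If that holds, the mismatch is more likely to come from the conversion of the bytes to a String than from the signedness of Java's byte type.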

How do I have to change my Java code to get the same key as in Python? As I understand it, I have to encode the Java byte array as a string with the same value as the one passed to Python's .derive(...) method. Thanks in advance.

UPDATE:

salt = b'salt'
PASSWORD = bytes((255, 250, 170, 187, 188, 204, 221, 221, 255, 170, 170, 170, 170, 170, 170, 187))
key = PBKDF2HMAC(hashes.SHA256(), 32, salt, 512, backend).derive(PASSWORD)

and

SecretKeyFactory factory = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256");
String password = new String(new byte[]{-1, -6, -86, -69, -68, -52, -35, -35, -1, -86, -86, -86, -86, -86, -86, -69});
var key = factory
        .generateSecret(new PBEKeySpec(password.toCharArray(), "salt".getBytes(), 512, 256))
        .getEncoded();

should give the same result. It works for new byte[]{1,2,3,4,....16} password.

UPDATE2: I converted the password bytes to unsigned values in a char[], but it still doesn't work:

    char[] password = new char[PASSWORD.length];
    for (int i = 0; i < PASSWORD.length; i++) {
        // Mask with 0xFF so each byte maps to a char in the range 0..255
        password[i] = (char) (PASSWORD[i] & 0xFF);
    }
    var key = factory
            .generateSecret(new PBEKeySpec(password, "salt".getBytes(), 512, 256))
            .getEncoded();
    

Solution

  • Apart from the different digests (see the 1st answer), the problem is that the key derived with PBKDF2WithHmacSHA256 is an instance of PBKDF2KeyImpl, which requires the password as text (a char array). PBKDF2KeyImpl UTF-8 encodes this text (see the documentation of the class PBKDF2KeyImpl). Here, however, the password is an (arbitrary) byte sequence, which is generally not valid UTF-8, so the data is corrupted during UTF-8 decoding. A possible solution is to replace PBEKeySpec with BouncyCastle's PKCS5S2ParametersGenerator, which expects the password as a byte array (in init):

    import java.nio.charset.StandardCharsets;
    import org.bouncycastle.crypto.PBEParametersGenerator;
    import org.bouncycastle.crypto.digests.SHA256Digest;
    import org.bouncycastle.crypto.generators.PKCS5S2ParametersGenerator;
    import org.bouncycastle.crypto.params.KeyParameter;
    ...
    byte[] salt = "salt".getBytes(StandardCharsets.UTF_8);
    int iterations = 512;
    byte[] PASSWORD = new byte[] { (byte)255, (byte)250, (byte)170, (byte)187, (byte)188, (byte)204, (byte)221, (byte)221, (byte)255, (byte)170, (byte)170, (byte)170, (byte)170, (byte)170, (byte)170, (byte)187 };
    PBEParametersGenerator generator = new PKCS5S2ParametersGenerator(new SHA256Digest());
    generator.init(PASSWORD, salt, iterations);
    byte[] keyBytes = ((KeyParameter)generator.generateDerivedParameters(256)).getKey(); 
    // with bytesToHex from https://stackoverflow.com/a/9855338
    System.out.println(bytesToHex(keyBytes).toLowerCase());  // d8aa4772e9648572611fe6dca7f653353de934cdb3b29fab94eb13ba2b198b9f
    

    The result now matches that of the Python code:

    from cryptography.hazmat.primitives.kdf.pbkdf2 import PBKDF2HMAC
    from cryptography.hazmat.primitives import hashes
    
    salt = b'salt'
    iterations = 512
    PASSWORD = bytes((255, 250, 170, 187, 188, 204, 221, 221, 255, 170, 170, 170, 170, 170, 170, 187))
    key = PBKDF2HMAC(hashes.SHA256(), 32, salt, iterations).derive(PASSWORD)
    
    print(key.hex()) # d8aa4772e9648572611fe6dca7f653353de934cdb3b29fab94eb13ba2b198b9f
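As a rough cross-check of the explanation above, the sketch below uses only Python's standard hashlib.pbkdf2_hmac instead of the cryptography package (a substitution made for illustration). It assumes, as described above, that the JDK UTF-8 encodes the char[] password, and it imitates the UPDATE2 approach of mapping each byte to one char. The names to_utf8_password, ascii_password and utf8_password are hypothetical.

```python
import hashlib

salt = b"salt"
iterations = 512
raw = bytes((255, 250, 170, 187, 188, 204, 221, 221,
             255, 170, 170, 170, 170, 170, 170, 187))


def to_utf8_password(data: bytes) -> bytes:
    """Map each byte to one char (value & 0xFF), then UTF-8 encode the text.

    Assumption: this mirrors what happens to the char[] from UPDATE2
    once the JDK encodes it as UTF-8.
    """
    return "".join(chr(b) for b in data).encode("utf-8")


# Bytes below 128 are plain ASCII and survive the round trip unchanged,
# which would explain why the dummy password 1..16 worked.
ascii_password = bytes(range(1, 17))
print(to_utf8_password(ascii_password) == ascii_password)  # True

# Bytes of 128 and above each become two bytes in UTF-8.
utf8_password = to_utf8_password(raw)
print(len(raw), len(utf8_password))  # 16 32

# PBKDF2 therefore sees two different passwords and derives different keys.
key_raw = hashlib.pbkdf2_hmac("sha256", raw, salt, iterations, dklen=32)
key_utf8 = hashlib.pbkdf2_hmac("sha256", utf8_password, salt, iterations, dklen=32)
print(key_raw.hex())
print(key_raw == key_utf8)  # False
```

Under that assumption, any approach that routes the password through a char[] changes bytes of 128 and above, which is why passing the raw byte array, as in the BouncyCastle code, avoids the problem.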