Tags: java, base64, apache-commons, encryption

Why does my decryption function return garbage characters?


I have translated a C#-based decrypt function into Java. It works fine and can decrypt passwords that were encrypted by the C# program. Here is the source code:

import org.apache.commons.codec.binary.Base64;

import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.security.Key;

public class TestDecrpt {
    public static void main(String[] args) throws Exception {
        String data = "encrypted data";
        String sEncryptionKey = "encryption key";
        byte[] rawData = new Base64().decode(data);
        byte[] salt = new byte[8];
        System.arraycopy(rawData, 0, salt, 0, salt.length);

        Rfc2898DeriveBytes keyGen = new Rfc2898DeriveBytes(sEncryptionKey, salt);

        byte[] IV = keyGen.getBytes(128 / 8);
        byte[] keyByte = keyGen.getBytes(256 / 8);

        Key key = new SecretKeySpec(keyByte, "AES");
        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        cipher.init(Cipher.DECRYPT_MODE, key, new IvParameterSpec(IV));
        int pureDataLength = rawData.length - 8;
        byte[] pureData = new byte[pureDataLength];
        System.arraycopy(rawData, 8, pureData, 0, pureDataLength);
        String plaintext = new String(cipher.doFinal(pureData), "UTF-8").replaceAll("\u0000", "");
        System.out.println(plaintext);
    }
}

I followed the same algorithm to write the encrypt function. The code is:

import org.apache.commons.codec.binary.Base64;

import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.security.Key;
import java.security.SecureRandom;


public class testEncrypt {
    public static void main(String[] args) throws Exception {
        String data = "Welcome2012~1@Welcome2012~1@Welcome2012~1@Welcome2012~1@Welcome2012~1@";
        String sEncryptionKey = "encryption key"; // the same key
        byte[] rawData = new Base64().decode(data);
        SecureRandom random = new SecureRandom();
        byte[] salt = new byte[8];
        random.nextBytes(salt);
        Rfc2898DeriveBytes keyGen = new Rfc2898DeriveBytes(sEncryptionKey, salt);

        byte[] IV = keyGen.getBytes(128 / 8);
        byte[] keyByte = keyGen.getBytes(256 / 8);

        Key key = new SecretKeySpec(keyByte, "AES");
        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(IV));
        byte[] out2 = cipher.doFinal(rawData);

        byte[] out = new byte[8 + out2.length];
        System.arraycopy(salt, 0, out, 0, 8);
        System.arraycopy(out2, 0, out, 8, out2.length);
        //String outStr=new String(out,"UTF-8");
        String outStr = new Base64().encodeToString(out);
        System.out.println(outStr);
        System.out.print(outStr.length());

    }
}

However, the encrypted data cannot be decrypted correctly; decryption always returns garbage characters, such as

ꉜ뙧巓妵峩枢펶땝ꉜ뙧巓妵峩枢펶땝ꉜ뙧巓�

Is there something wrong with the encrypt function?

================================================================================

[Update] After changing the code to

byte[] rawData = data.getBytes("UTF-8");

The data can now be encrypted and decrypted successfully in Java. However, data encrypted in Java cannot be decrypted correctly in C#. Here is the C# version of the decrypt function:

using System;
using System.IO;
using System.Security.Cryptography;
using System.Text;


namespace Test
{
    class Program
    {
        public static void Main(string[] args)
        {
            string data = "EncryptedData";
            string sEncryptionKey = "EncryptionKey";

            byte[] rawData = Convert.FromBase64String(data);
            byte[] salt = new byte[8];
            for (int i = 0; i < salt.Length; i++)
                salt[i] = rawData[i];

            Rfc2898DeriveBytes keyGenerator = new Rfc2898DeriveBytes(sEncryptionKey, salt);
            Rijndael aes = Rijndael.Create();
            aes.IV = keyGenerator.GetBytes(aes.BlockSize / 8);
            aes.Key = keyGenerator.GetBytes(aes.KeySize / 8);

            using (MemoryStream memoryStream = new MemoryStream())
            using (CryptoStream cryptoStream = new CryptoStream(memoryStream, aes.CreateDecryptor(), CryptoStreamMode.Write))
            {
                cryptoStream.Write(rawData, 8, rawData.Length - 8);
                cryptoStream.Close();

                byte[] decrypted = memoryStream.ToArray();
                Console.Out.WriteLine(Encoding.Unicode.GetString(decrypted));
                Console.In.ReadLine();
            }
        }
    }
}

I found that the original code uses "Unicode" as the output encoding,

Encoding.Unicode.GetString(decrypted)

so I changed my Java code to use "Unicode" as well.

For decryption in Java:

String plaintext = new String(cipher.doFinal(pureData), "Unicode");
System.out.println(plaintext);

For encryption in Java:

byte[] rawData = data.getBytes("Unicode");

But using the C# code to decrypt data that was encrypted by the Java program still produces garbage characters.

How can I fix this issue? Is there some trick I am missing?


[Last Update] After switching the Java charset to "UTF-16LE", the issue is gone. It seems that "UTF-16LE" is the Java equivalent of C#'s Encoding.Unicode.
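
A likely explanation for the charset difference, as a small sketch: Java's UTF-16 charset (which, as far as I know, is what the "Unicode" name resolves to) encodes big-endian and prepends a byte-order mark, while C#'s Encoding.Unicode is little-endian UTF-16 and GetBytes writes no mark. The class name Utf16Demo is made up for this illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class Utf16Demo {
    // Encodes with Java's generic UTF-16 charset: big-endian, with a leading BOM.
    static byte[] utf16WithBom(String text) {
        return text.getBytes(StandardCharsets.UTF_16);
    }

    // Encodes as little-endian UTF-16 without a BOM, the layout C#'s Encoding.Unicode uses.
    static byte[] utf16LittleEndian(String text) {
        return text.getBytes(StandardCharsets.UTF_16LE);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(utf16WithBom("A")));      // [-2, -1, 0, 65]
        System.out.println(Arrays.toString(utf16LittleEndian("A"))); // [65, 0]
    }
}
```

If the BOM and the swapped byte order end up inside the encrypted payload, a little-endian decoder on the other side would read every character wrongly, which would match the garbage seen in C#.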


Solution

  • This is the problem:

    String data = "Welcome2012~1@Welcome2012~1@Welcome2012~1@Welcome2012~1@Welcome2012~1@";
    byte[] rawData = new Base64().decode(data);
    

    That text is not meant to be base64-encoded binary data. It's just text. Why are you trying to decode it as base64 data?

    You want:

    byte[] rawData = data.getBytes("UTF-8");
    

    That way, when you later write:

    String plaintext = new String(cipher.doFinal(pureData), "UTF-8")
                                        .replaceAll("\u0000", "");
    

    you're doing the reverse action. (Admittedly you probably shouldn't need the replaceAll call, but that's a different matter.)
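
The difference between the two calls can be seen in isolation. This is only a sketch: it uses java.util.Base64's MIME decoder as a stand-in for commons-codec, on the assumption that both silently skip characters outside the Base64 alphabet (which would explain why the original code threw no exception). The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class Base64Misuse {
    // Lenient decode: characters outside the Base64 alphabet ('~' and '@' here) are skipped.
    static byte[] decodeAsBase64(String text) {
        return Base64.getMimeDecoder().decode(text);
    }

    // What the encrypt side actually needs: the text's bytes in a known charset.
    static byte[] encodeAsUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String text = "Welcome2012~1@";
        byte[] wrong = decodeAsBase64(text);
        byte[] right = encodeAsUtf8(text);

        System.out.println(wrong.length); // 9
        System.out.println(right.length); // 14

        // Only the UTF-8 bytes turn back into the original text.
        System.out.println(new String(right, StandardCharsets.UTF_8).equals(text)); // true
        System.out.println(new String(wrong, StandardCharsets.UTF_8).equals(text)); // false
    }
}
```

The Base64-decoded bytes are unrelated to the characters of the password, so no choice of charset on the decrypt side can recover the text from them.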

    For anything like this, you need to make sure that the steps you take on the way "out" are the reverse of the steps on the way "in". So in the correct code, you have:

    Unencrypted text data => unencrypted binary data (encode via UTF-8)
    Unencrypted binary data => encrypted binary data (encrypt with AES)
    Encrypted binary data => encrypted text data (encode with base64)
    

    So to reverse, we do:

    Encrypted text data => encrypted binary data (decode with base64)
    Encrypted binary data => unencrypted binary data (decrypt with AES)
    Unencrypted binary data => unencrypted text data (decode via UTF-8)
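
The two pipelines above can be sketched as a pair of mirrored methods. This is not the asker's exact code: Rfc2898DeriveBytes is their own port, so this sketch substitutes the JDK's PBKDF2WithHmacSHA1 with 1000 iterations (an assumption about what the port does), and java.util.Base64 replaces commons-codec to keep the example self-contained. The class name RoundTripSketch and its method names are made up for illustration.

```java
import javax.crypto.Cipher;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.Arrays;
import java.util.Base64;

class RoundTripSketch {
    private static final int SALT_LEN = 8;
    private static final int ITERATIONS = 1000; // assumption, not taken from the question

    // Derives 16 bytes of IV followed by 32 bytes of key, in the same order as the question's code.
    private static Cipher cipherFor(int mode, String password, byte[] salt) throws Exception {
        PBEKeySpec spec = new PBEKeySpec(password.toCharArray(), salt, ITERATIONS, (16 + 32) * 8);
        byte[] derived = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA1")
                .generateSecret(spec).getEncoded();
        byte[] iv = Arrays.copyOfRange(derived, 0, 16);
        byte[] key = Arrays.copyOfRange(derived, 16, 48);
        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        cipher.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
        return cipher;
    }

    // text -> UTF-8 bytes -> AES -> salt + ciphertext -> Base64 text
    public static String encrypt(String plaintext, String password) throws Exception {
        byte[] salt = new byte[SALT_LEN];
        new SecureRandom().nextBytes(salt);
        byte[] encrypted = cipherFor(Cipher.ENCRYPT_MODE, password, salt)
                .doFinal(plaintext.getBytes(StandardCharsets.UTF_8));
        byte[] out = new byte[SALT_LEN + encrypted.length];
        System.arraycopy(salt, 0, out, 0, SALT_LEN);
        System.arraycopy(encrypted, 0, out, SALT_LEN, encrypted.length);
        return Base64.getEncoder().encodeToString(out);
    }

    // Base64 text -> salt + ciphertext -> AES -> UTF-8 bytes -> text
    public static String decrypt(String encoded, String password) throws Exception {
        byte[] raw = Base64.getDecoder().decode(encoded);
        byte[] salt = Arrays.copyOfRange(raw, 0, SALT_LEN);
        byte[] decrypted = cipherFor(Cipher.DECRYPT_MODE, password, salt)
                .doFinal(raw, SALT_LEN, raw.length - SALT_LEN);
        return new String(decrypted, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        String encrypted = encrypt("Welcome2012~1@", "encryption key");
        System.out.println(encrypted);
        System.out.println(decrypt(encrypted, "encryption key"));
    }
}
```

Each step in decrypt undoes exactly one step in encrypt, in the opposite order. To interoperate with the C# code shown in the question, the charset in both methods would presumably need to be UTF_16LE instead of UTF_8.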