Twofish known answer test

I'm looking into using Twofish to encrypt strings of data. Before trusting my precious data to an unknown library, I wish to verify that it agrees with the known answer tests (KAT) published on Bruce Schneier's website.

To my dismay, I tried three Twofish implementations and found none that agree with the KAT. This leads me to believe that I'm doing something wrong, and I'm wondering if someone could tell me what it is.

I've made sure the mode is the same (CBC), the key length is the same (128 bits) and the IV/key/PT values are the same. Is there an additional parameter in play for Twofish encryption?

Here are the first two test entries from CBC_E_M.txt from the KAT archive:

I=0
KEY=00000000000000000000000000000000
IV=00000000000000000000000000000000
PT=00000000000000000000000000000000
CT=3CC3B181E1495D0495D652B66921DA0F

I=1
KEY=3CC3B181E1495D0495D652B66921DA0F
IV=3CC3B181E1495D0495D652B66921DA0F
PT=BE938D30FAB43B71F2E114E9C0529299
CT=695250B109C6F71D410AC38B0BBDA3D2

I interpret these as hex, so each value is 16 bytes = 128 bits long.

I tried using the following Twofish implementations:

ruby: https://github.com/mcarpenter/twofish.rb
JS: https://github.com/ryanofsky/twofish/
online: http://twofish.online-domain-tools.com/

All three give the same CT for the first test, namely (hex encoded)

9f589f5cf6122c32b6bfec2f2ae8c35a

So far so good, except it does not agree with CT0 in the KAT...

For the second test the ruby library and the online tool give:

f84268f0293adf4d24e27194911a24c

While the js library gives:

fd803b310bb5388ddb76d5faf9e23dbe

And neither of these agrees with CT1 in the KAT.

Am I doing something wrong here? Any help greatly appreciated.

The online tool is easy to use, just be sure to select HEX for the key and input text. Here is the ruby code I used to generate these values (it's necessary to check out each library for this to work):

def twofish_encrypt(iv_hex, key_hex, data_hex)
  iv = iv_hex.gsub(/ /, "").scan(/../).map { |x| x.hex.chr }.join
  key = key_hex.gsub(/ /, "").scan(/../).map { |x| x.hex.chr }.join
  data = data_hex.gsub(/ /, "").scan(/../).map { |x| x.hex.chr }.join

  tf = Twofish.new(key, :mode => :cbc, :padding => :none)
  tf.iv = iv
  enc_data = tf.encrypt(data)
  enc_data.each_byte.map { |b| b.to_s(16) }.join
end

ct0 = twofish_encrypt("00000000000000000000000000000000",
                      "00000000000000000000000000000000",
                      "00000000000000000000000000000000")
puts "ct0: #{ct0}"
ct1 = twofish_encrypt("3CC3B181E1495D0495D652B66921DA0F",
                      "3CC3B181E1495D0495D652B66921DA0F",
                      "BE938D30FAB43B71F2E114E9C0529299")
puts "ct1: #{ct1}"

And the equivalent JavaScript:

function twofish_encrypt(iv_hex, key_hex, data_hex) {
    var iv = new BinData();
    iv.setHexNibbles(iv_hex);
    iv.setlength(16*8);
    var binkey = new BinData();
    binkey.setHexNibbles(key_hex);
    binkey.setlength(16*8);
    var key = new TwoFish.Key(binkey);

    var data = new BinData();
    data.setHexNibbles(data_hex);
    data.setlength(16*8);

    var cipher = new TwoFish.Cipher(TwoFish.MODE_CBC, iv);
    var enc_data = TwoFish.Encrypt(cipher, key, data);

    return enc_data.getHexNibbles(32);
}

var ct0 = twofish_encrypt("00000000000000000000000000000000",
                          "00000000000000000000000000000000",
                          "00000000000000000000000000000000");
console.log("ct0: " + ct0);

var ct1 = twofish_encrypt("3CC3B181E1495D0495D652B66921DA0F",
                          "3CC3B181E1495D0495D652B66921DA0F",
                          "BE938D30FAB43B71F2E114E9C0529299");
console.log("ct1: " + ct1);

Solution

The header of the CBC_E_M.txt file reads:

Cipher Block Chaining (CBC) Mode - ENCRYPTION
Monte Carlo Test

That header explains the confusion. From the NIST description of the Monte Carlo Tests:

Each Monte Carlo Test consists of four million cycles through the candidate algorithm implementation. These cycles are divided into four hundred groups of 10,000 iterations each. Each iteration consists of processing an input block through the candidate algorithm, resulting in an output block. At the 10,000th cycle in an iteration, new values are assigned to the variables needed for the next iteration. The results of each 10,000th encryption or decryption cycle are recorded and included by the submitter in the appropriate file.

So what you get in the text file is 400 results, each the outcome of 10,000 chained iterations, where the input of each iteration depends on the output of the previous ones. This is obviously not the same as a single encryption. Monte Carlo tests basically run many tests on randomized input; in this case a large number of block cipher encryptions provides the randomization.
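The chaining can be sketched roughly as follows. This is an outline of the CBC encryption Monte Carlo loop as I understand the NIST description for 128-bit keys, not a verified reference implementation. The names `cbc_monte_carlo` and `encrypt_block` are hypothetical (the latter stands for one raw single-block encryption), and the toy lambda at the end is only a stand-in so the sketch runs without a Twofish library.

```ruby
# XOR two equal-length binary strings byte by byte.
def xor_bytes(a, b)
  a.bytes.zip(b.bytes).map { |x, y| x ^ y }.pack("C*")
end

# Sketch of the CBC-encrypt Monte Carlo loop (assumed layout, 128-bit keys).
# encrypt_block: hypothetical callable (key, block) -> one raw encrypted block.
def cbc_monte_carlo(encrypt_block, key, iv, pt, groups: 400, inner: 10_000)
  results = []
  groups.times do |i|
    record = { i: i, key: key, iv: iv, pt: pt }
    cv = iv                            # CBC chaining value
    inner.times do
      ct = encrypt_block.call(key, xor_bytes(pt, cv))
      pt, cv = cv, ct                  # next plaintext is the old chaining value
    end
    record[:ct] = cv                   # only the last ciphertext of the group is recorded
    results << record
    key = xor_bytes(key, cv)           # next key and IV are derived from that result
    iv = cv
  end
  results
end

# Toy stand-in for a block cipher (NOT Twofish, NOT secure), just to run the loop.
toy = ->(k, b) { xor_bytes(k, b).bytes.rotate(1).map { |x| (x + 1) & 0xff }.pack("C*") }
zero = "\x00".b * 16
demo = cbc_monte_carlo(toy, zero, zero, zero, groups: 2, inner: 5)
puts demo.map { |r| r[:ct].unpack1("H*") }
```

With a real Twofish block function, the very first inner step (all-zero key, IV and plaintext) is the single encryption from the question; the CT recorded for I=0 only appears 9,999 steps later.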

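To test whether your CBC code is correct, just use any of the other test vectors (not the Monte Carlo ones) and assume an all-zero IV. In that case a single-block ECB encryption gives the same result as CBC mode, because CBC XORs the IV into the first plaintext block and XOR with zero changes nothing. A similar check works for the ever more popular CTR mode, but only when the plaintext block is all zero as well, since CTR encrypts the counter block and XORs the result into the plaintext.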
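The equivalence is easy to check with any block cipher. The sketch below uses AES-128 from Ruby's standard OpenSSL binding purely as a stand-in, since as far as I know OpenSSL does not provide Twofish; the helper name `encrypt_block` and the key and plaintext values are illustrative choices, not part of the KAT archive.

```ruby
require "openssl"

# Encrypt one 16-byte block with AES-128 in the given mode, without padding.
# AES is only a stand-in cipher here; the mode relationships are the point.
def encrypt_block(mode, key, iv, block)
  cipher = OpenSSL::Cipher.new("aes-128-#{mode}")
  cipher.encrypt
  cipher.key = key
  cipher.iv = iv unless mode == "ecb"   # ECB takes no IV
  cipher.padding = 0
  cipher.update(block) + cipher.final
end

key  = ["000102030405060708090a0b0c0d0e0f"].pack("H*")
pt   = ["00112233445566778899aabbccddeeff"].pack("H*")
zero = "\x00".b * 16

# CBC with an all-zero IV matches ECB for a single block.
puts encrypt_block("cbc", key, zero, pt) == encrypt_block("ecb", key, nil, pt)

# CTR with an all-zero counter block matches ECB only for an all-zero plaintext.
puts encrypt_block("ctr", key, zero, zero) == encrypt_block("ecb", key, nil, zero)
```

The same comparison should carry over to a Twofish library that exposes both ECB and CBC modes.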

The initial 9f589f5cf6122c32b6bfec2f2ae8c35a value that you found is correct for a 128-bit all-zero key, IV and plaintext. The f84268f0293adf4d24e27194911a24c value is correct as well.

There is certainly something wrong with your hex encoder, though: that second result is not even the correct size (31 hex digits instead of 32). What happens to the leading zeros of the hex encodings? In the Ruby code, b.to_s(16) produces a single digit for any byte below 0x10, so I would definitely take a look at your encoding / decoding functions.
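A minimal sketch of an encoder that always emits two hex digits per byte, next to the one from the question for comparison. The helper names `to_hex`, `from_hex` and `to_hex_lossy` are made up for this example.

```ruby
# Hex-encode a binary string, always emitting two digits per byte.
def to_hex(bytes)
  bytes.each_byte.map { |b| format("%02x", b) }.join
end

# Decode a hex string (spaces allowed) into a binary string.
def from_hex(hex)
  [hex.delete(" ")].pack("H*")
end

# The encoder from the question, kept for comparison: it drops leading zeros.
def to_hex_lossy(bytes)
  bytes.each_byte.map { |b| b.to_s(16) }.join
end

sample = from_hex("0fa0 05")
puts to_hex_lossy(sample)  # => fa05
puts to_hex(sample)        # => 0fa005
```

With the padded encoder, a 16-byte ciphertext always comes out as 32 hex digits, which makes it directly comparable to the values in the test vector files.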