javascript hash cryptojs arraybuffer

Different SHA-256 hash for a file vs. the file contents as a WordArray


I made a test text file, the contents are: aaaaaaaabbbbbbbbccccccccddddddddeeeeeeeeffffffffgggggggghhhhhhhh

The sha256 hex digest for this value as a string is: 75eef9680de51f6f70291057e9afc5975470960dfec5f37f83db69aa625786e5

I get this same value when hashing the string in Python using hashlib, in JS using CryptoJS, or when running openssl on the file from the command line.

However, in JS, when I read the file in like this:

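var fr = new FileReader();
// readAsArrayBuffer is asynchronous, so fr.result is only set once onload fires
fr.onload = function () {
    console.log(fr.result.byteLength); // it's 64...
    var input = CryptoJS.lib.WordArray.create(new Uint8Array(fr.result));
    console.log(CryptoJS.SHA256(input).toString());
};
fr.readAsArrayBuffer(file);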

I get: 8f76bf13468fb12ac4e59610adff70fd10282e8494a2749db4677f81e2c6e998

UPDATE: from the crypto.js docs:

/**
 * cryptojs use WordArray (CryptoJS.lib.WordArray) as parameter/result frequently.
 *    A WordArray object represents an array of 32-bit words. When you pass a string,
 * it's automatically converted to a WordArray encoded as UTF-8.
 */

I suspect it could be a UTF-8 vs. ASCII issue, or something similar, but I have no idea how to check.
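One way to check is to look at how the bytes end up in the WordArray. Since a WordArray is an array of 32-bit words, a plausible explanation (an assumption, not confirmed in this thread) is that a CryptoJS build without typed-array support treats each element of the Uint8Array as a whole word, so the hash covers 256 bytes instead of 64. The sketch below packs the bytes into big-endian 32-bit words by hand; `bytesToWords` is a hypothetical helper name, not part of CryptoJS.

```javascript
// Pack bytes into big-endian 32-bit words, four bytes per word.
// bytesToWords is a hypothetical helper, not a CryptoJS function.
function bytesToWords(u8) {
    var words = [];
    for (var i = 0; i < u8.length; i++) {
        // Byte i lands in word floor(i / 4), most significant byte first.
        words[i >>> 2] |= u8[i] << (24 - (i % 4) * 8);
    }
    // sigBytes records how many bytes are significant (the last word may be partial).
    return { words: words, sigBytes: u8.length };
}

// 64 bytes should become 16 words, not 64.
var bytes = new Uint8Array(64).fill(0x61);
var packed = bytesToWords(bytes);
console.log(packed.words.length, packed.sigBytes);
```

The result can then be passed as `CryptoJS.lib.WordArray.create(packed.words, packed.sigBytes)`. If the hash then matches the expected digest, the difference came from byte packing rather than from text encoding.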


Solution

  • I found asmCrypto.js, which accepts both ArrayBuffers and Uint8Arrays as input. I'm now getting the expected result (it's also pretty fast). I use it like this:

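    var fr = new FileReader();
    // readAsArrayBuffer is asynchronous, so hash inside the onload callback
    fr.onload = function () {
        console.log(fr.result.byteLength); // it's still 64...
        console.log(asmCrypto.SHA256.hex(new Uint8Array(fr.result)));
        console.log(asmCrypto.SHA256.hex(fr.result)); // this also works
    };
    fr.readAsArrayBuffer(file);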
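As an alternative not mentioned in the original answer, environments that expose the Web Crypto API can hash an ArrayBuffer directly with `crypto.subtle.digest`, with no extra library. This is a minimal sketch assuming a secure context (HTTPS or localhost) in the browser; `sha256Hex` is a hypothetical helper name.

```javascript
// Hash an ArrayBuffer (or typed array) with the built-in Web Crypto API.
// sha256Hex is a hypothetical helper name used for illustration.
async function sha256Hex(buffer) {
    // digest() resolves to an ArrayBuffer holding the 32-byte hash.
    var digest = await crypto.subtle.digest('SHA-256', buffer);
    // Convert each byte to two lowercase hex characters.
    return Array.from(new Uint8Array(digest))
        .map(function (b) { return b.toString(16).padStart(2, '0'); })
        .join('');
}
```

Inside the `onload` callback this could be used as `sha256Hex(fr.result).then(console.log)`.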