
Why is the hash of this file staying the same when I change its data?


I'm trying to hash a file in JS with this piece of code:

var CryptoJS = require("crypto-js");
var fs = require('fs');

fs.readFile('./file.txt', function(err, data) {
    if (err) {
        console.error("Could not open file: %s", err);
        process.exit(1);
    }
    console.log("HASH: " + CryptoJS.SHA256(data));
});

No matter what I write into the .txt file, the hash produced is always: 4ea5c508a6566e76240543f8feb06fd457777be39549c4016436afda65d2330e

If I pass string data directly, as in CryptoJS.SHA256("text_exemple"), the hash works correctly.

What am I missing here?


Solution

  • I was curious why this wasn't working. First, let's look at what is actually happening. You don't pass an encoding when you call readFile. Per the readFile docs:

    If no encoding is specified, then the raw buffer is returned.

    That's fine. You would expect the Buffer for a file to be just as hashable as the file's contents. However, the crypto library doesn't account for receiving a Buffer; it only handles Strings. You can see that in the library source at core.js:512, where it does a typeof data === 'string' check.

    Since typeof a_buffer === "string" evaluates to false, the hash is never updated with the file's contents. Because of that, you get the same hash every time.

    So, the solution is to just provide an encoding:

    fs.readFile('./file.txt', "utf8", function(err, data){...});
    

    Or convert the Buffer into a String yourself so that you hash the actual data, for example with data.toString("utf8"), where data is the Buffer from readFile.
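
To make the typeof point concrete, here is a minimal sketch. It deliberately uses Node's built-in crypto module rather than crypto-js, so it runs without installing anything; sha256Hex is a hypothetical helper name, and the two in-memory Buffers are stand-ins for what readFile returns when no encoding is given. It only illustrates that a Buffer is not a string, and that decoding it gives a string with the same bytes to hash.

```javascript
// Sketch only: Node's built-in crypto is swapped in for crypto-js here.
const crypto = require("crypto");

// Hypothetical helper: SHA-256 hex digest of a string or a Buffer.
function sha256Hex(input) {
  return crypto.createHash("sha256").update(input).digest("hex");
}

// Stand-ins for the raw Buffers readFile returns without an encoding.
const first = Buffer.from("hello", "utf8");
const second = Buffer.from("hello, world", "utf8");

// A Buffer is an object, so a `typeof data === 'string'` check skips it.
console.log(typeof first);           // "object"
console.log(Buffer.isBuffer(first)); // true

// Decoding yields a real string, which a string-only code path accepts.
const text = first.toString("utf8");
console.log(typeof text); // "string"

// The decoded string and the original bytes hash identically,
// and different file contents give different digests.
console.log(sha256Hex(text) === sha256Hex(first));   // true
console.log(sha256Hex(first) === sha256Hex(second)); // false
```

If you don't specifically need crypto-js, the built-in module has the advantage of accepting the Buffer directly, so no encoding step is required.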