Tags: node.js, mongodb, hash, mongoose, md5

Adding md5 hash value to mongo collection


Issue: I currently have a mongo collection with 100,000 documents. Each document has three fields (_id, name, age). I want to add a fourth field, hashValue, to each document that stores the md5 hash of that document's name field.

I can currently interact with my collection via the mongo shell or via the Mongoose ODM as part of a Node.js app.

Possible Solutions:

  1. Use Mongoose/Node.js:

I realize this won't work (I don't believe you can iterate through a cursor in this manner), but hopefully it shows what I'm trying to do.

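    var crypto = require('crypto');

    // Intent only (see the note above): not expected to run as written.
    MyCollection.find().forEach(function(el){
        var hash = crypto.createHash('md5').update(el.name).digest("hex");
        el.hashValue = hash;  // store the hash in the new field instead of overwriting name
        el.save();
    });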
  2. Use the mongo shell: almost the same as above, and I realize something like the above syntax would work there. The only issue is that I don't know how to create the md5 hash in the mongo shell, although I am able to iterate through each document and add a field.

  3. (Possible workaround) The goal of all this is to be able to query by the md5 hash of a name value. I believe mongo allows you to create a hashed index. The only issue is that I can't find an example of anyone using one for querying (it only seems to be used for sharding), and I'm not sure it will work later on. (Example: I want to md5 hash a name I collect from a user, and then query my mongo collection to see if I can find that md5 hash in the hashValue field.)
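The lookup described in the example can be sketched roughly as follows. This is only an illustration under assumptions: md5Hex and hashLookupFilter are hypothetical helper names, and the sketch assumes the hashValue field has already been populated with hex-encoded md5 digests.

```javascript
const crypto = require('crypto');

// Hex-encoded md5 digest of a string, built with Node's crypto module.
function md5Hex(text) {
  return crypto.createHash('md5').update(text, 'utf8').digest('hex');
}

// Hypothetical helper: builds the query filter that matches a document
// by the md5 hash of a user-supplied name.
function hashLookupFilter(name) {
  return { hashValue: md5Hex(name) };
}

// 'john' stands in for a name collected from a user.
const filter = hashLookupFilter('john');
console.log(filter);
```

With Mongoose, such a filter would presumably be passed to something like MyCollection.findOne(filter). An ordinary index on hashValue should be enough to make this kind of equality lookup fast.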


Solution

  • The mongo shell already ships with an md5 hash function called hex_md5 (it is a shell helper, not part of the JavaScript language itself), so you can call it directly in the console.

    > hex_md5('john')
    527bd5b5d689e2c32ae974c6229ff785
    

    So, to update the records in your case, you can run the following snippet in the mongo console:

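    // Replace "collection" with the name of your collection.
    db.collection.find().forEach( function(data){
      // Add the new field, then write the whole document back.
      data.hashValue = hex_md5(data.name);
      db.collection.save(data);
    });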
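
If the update has to run from the Node.js app instead of the shell, one possible approach is to compute the same digest with the crypto module and send the changes as bulk write operations. The sketch below only builds those operations from plain objects, so it runs without a database. md5Hex, buildHashUpdates and the sample documents are made up for illustration, and for the same UTF-8 input the digest should match what hex_md5 returns.

```javascript
const crypto = require('crypto');

// Hex-encoded md5 digest of a string.
function md5Hex(text) {
  return crypto.createHash('md5').update(text, 'utf8').digest('hex');
}

// Hypothetical helper: turns plain documents into bulkWrite operations
// that set hashValue on each document, matched by _id.
function buildHashUpdates(docs) {
  return docs.map((doc) => ({
    updateOne: {
      filter: { _id: doc._id },
      update: { $set: { hashValue: md5Hex(doc.name) } },
    },
  }));
}

// Sample documents, invented for this example.
const sampleDocs = [
  { _id: 1, name: 'john', age: 30 },
  { _id: 2, name: 'jane', age: 25 },
];

const ops = buildHashUpdates(sampleDocs);
console.log(JSON.stringify(ops, null, 2));
```

In a real app the documents could come from something like MyCollection.find().lean() and the operations could be passed to MyCollection.bulkWrite(ops). With 100,000 documents it is probably worth processing them in batches rather than all at once.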