I have a JavaScript frontend which compares two object-hash SHA-1 hashes to determine whether an input has changed (in which case a processing pipeline needs to be rerun). I started building a Python interface to the same backend, which uses hashlib for the SHA-1 generation, but the two functions return different hash values even though the inputs are the same. I managed to produce the same hash values as hashlib using Node's crypto module, which means that the issue arises from object-hash.
hashlib
import hashlib
import json

data = {
    'key1': 'value1',
    'key2': 'value2',
    'key3': 'value3',
}
# Compact separators so the output matches JavaScript's JSON.stringify
json_data = json.dumps(data, separators=(',', ':')).encode('utf-8')
sha1 = hashlib.sha1()
sha1.update(json_data)
print(sha1.hexdigest())
# outputs f692755b3c38bc6b0dc376d775db8b07d6d5f256
crypto
const crypto = require('crypto');

const data = {
  key1: 'value1',
  key2: 'value2',
  key3: 'value3',
};
const stringData = JSON.stringify(data);
const shasum = crypto.createHash('sha1');
shasum.update(stringData);
console.log(shasum.digest('hex'));
// (same as hashlib) outputs f692755b3c38bc6b0dc376d775db8b07d6d5f256
object-hash (tested with and without stringifying, with no success)
const objectHash = require('object-hash');

const data = {
  key1: 'value1',
  key2: 'value2',
  key3: 'value3',
};
const stringData = JSON.stringify(data);
console.log(objectHash.sha1(stringData));
// outputs b5b0a100d7852748fe2e35bf00eeb536ad2d17d1
I saw in the object-hash docs that the package uses crypto, so it doesn't make sense for the two outputs to be different.
How can I make object-hash and hashlib/crypto all produce the same SHA-1 value?
It turns out that object-hash prefixes the value being hashed with its type. For strings, I needed to write string:{string_length}: to the hash stream before the string itself.
sha1 = hashlib.sha1()
# The line in question: the type-and-length prefix goes in before the string
sha1.update(f'string:{len(json_data)}:'.encode('utf-8'))
sha1.update(json_data)
res = sha1.hexdigest()
print(res)
Having done that, the hashes produced by hashlib and crypto are the same as those of object-hash.
Note: This is not documented and I had to look through the source code to find exactly how to prefix strings in particular. Other types have different prefixes.
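For reuse, the string case can be wrapped in a small helper. This is only a sketch: the names utf16_length and object_hash_style_sha1 are made up for illustration, and it rests on two assumptions that are not confirmed above, namely that the length in the prefix is the JavaScript string length (UTF-16 code units rather than UTF-8 bytes) and that the string itself is hashed as UTF-8. For pure ASCII input such as the JSON in the question, both lengths are identical, so it behaves like the snippet above.

```python
import hashlib
import json


def utf16_length(text: str) -> int:
    """Return the length of text in UTF-16 code units (what JS String.length reports)."""
    return len(text.encode('utf-16-le')) // 2


def object_hash_style_sha1(text: str) -> str:
    """Hypothetical helper: SHA-1 of a string with a 'string:<length>:' prefix.

    Assumption: the prefix length is counted in UTF-16 code units and the
    payload is hashed as UTF-8. Only the string case is handled here.
    """
    prefix = f'string:{utf16_length(text)}:'
    digest = hashlib.sha1()
    digest.update(prefix.encode('utf-8'))
    digest.update(text.encode('utf-8'))
    return digest.hexdigest()


data = {'key1': 'value1', 'key2': 'value2', 'key3': 'value3'}
# Compact separators so the text matches what JSON.stringify produces
json_text = json.dumps(data, separators=(',', ':'))
print(object_hash_style_sha1(json_text))
```

One thing to watch with non-ASCII data: json.dumps escapes non-ASCII characters by default while JSON.stringify does not, so ensure_ascii=False is probably needed on the Python side before the two strings (and therefore the hashes) can agree.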