I am trying to use HMAC authentication for reading and writing pickle files.
Sample data:
import base64
import hashlib
import hmac
from datetime import datetime
import six
import pandas as pd
import pickle
import sys
df1 = pd.DataFrame({'id': [1, 2, 3, 4, 5],
                    'score': [720, 700, 710, 690, 670]})
df2 = pd.DataFrame({'name': ['abc', 'pqr', 'xyz'],
                    'address': ['1st st', '2nd ave', '3rd st']})
mylist = ['a', 'b', 'c', 'd', 'e']
mydict = {1 : 'p', 2 : 'q', 3 : 'r'}
obj = [df1, df2, mylist, mydict]
Write pickle file using:
data = pickle.dumps(obj)
digest = hmac.new(b'unique-key-here', data, hashlib.blake2b).hexdigest()
with open('temp.txt', 'wb') as output:
    output.write(bytes(digest, sys.stdin.encoding) + data)
But when I try to read it back using:
with open('temp.txt', 'rb') as f:
    digest = f.readline()
    data = f.read()
recomputed = hmac.new(b'unique-key-here', data, hashlib.blake2b).hexdigest()
if not hmac.compare_digest(digest, bytes(recomputed, sys.stdin.encoding)):
    print('Invalid signature')
else:
    print('Signature matching')
I am getting Invalid signature as output. Could someone please help me understand where I am going wrong?
Here is some code illustrating what I think is a cleaner way to solve the problem.
import hashlib
import hmac
import io
import os
import pickle
sample_obj = {'hello': [os.urandom(50)]}
data = pickle.dumps(sample_obj)
# write it out
my_hmac = hmac.new(b'my_hmac_key', digestmod=hashlib.blake2b)
my_hmac.update(data)
mac_result = my_hmac.digest()
pickle_out = io.BytesIO()
pickle_out.write(mac_result + data)
# read it in
pickle_in = io.BytesIO(pickle_out.getbuffer().tobytes())
my_hmac = hmac.new(b'my_hmac_key', digestmod=hashlib.blake2b)
mac_from_stream = pickle_in.read(my_hmac.digest_size)
data_from_stream = pickle_in.read()
my_hmac.update(data_from_stream)
computed_mac = my_hmac.digest()
# see if they match
print(hmac.compare_digest(computed_mac, mac_from_stream))
We avoid hexdigest() altogether and thus eliminate unnecessary encoding and decoding. We create the HMAC instance and keep it around so that we can read its digest_size property (my_hmac.digest_size), which tells us exactly how many bytes to read back as the MAC. The use of io.BytesIO is just for illustrating the I/O part of your code.
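As for why the code in the question fails: the hex digest is written without a trailing newline, so f.readline() keeps reading past the digest into the pickle bytes until it hits the first newline byte (or the end of the file). Reading a fixed number of bytes avoids that. Below is a minimal sketch of the same idea using a real file instead of io.BytesIO. The helper names write_signed_pickle and read_signed_pickle, the KEY constant, and the file name temp.bin are made up for illustration, and the key is only a placeholder.

```python
import hashlib
import hmac
import os
import pickle
import tempfile

KEY = b'my_hmac_key'  # placeholder key for illustration only


def write_signed_pickle(path, obj, key=KEY):
    """Pickle obj and write it to path, prefixed by its raw HMAC digest."""
    data = pickle.dumps(obj)
    mac = hmac.new(key, data, hashlib.blake2b).digest()
    with open(path, 'wb') as f:
        f.write(mac + data)


def read_signed_pickle(path, key=KEY):
    """Verify the HMAC prefix, then unpickle; raise ValueError on mismatch."""
    my_hmac = hmac.new(key, digestmod=hashlib.blake2b)
    with open(path, 'rb') as f:
        # The MAC is a fixed-size prefix, so read exactly digest_size bytes
        # instead of relying on a line separator.
        mac_from_file = f.read(my_hmac.digest_size)
        data = f.read()
    my_hmac.update(data)
    if not hmac.compare_digest(my_hmac.digest(), mac_from_file):
        raise ValueError('Invalid signature')
    # Unpickle only after the signature has been verified.
    return pickle.loads(data)


if __name__ == '__main__':
    path = os.path.join(tempfile.mkdtemp(), 'temp.bin')
    write_signed_pickle(path, ['a', 'b', {1: 'p'}])
    print(read_signed_pickle(path))
```

Raising an exception on a mismatch, rather than printing, is a design choice in this sketch: it makes it harder to call pickle.loads on unverified data by accident.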