I am trying to calculate the SHA-1 hash of an encrypted file (file.gpg) using Python 3.
I tested two functions:
import hashlib
import gnupg


def sha1sum(filename):
    # Hash the file in 128 KiB chunks, reusing one preallocated buffer.
    h = hashlib.sha1()
    b = bytearray(128*1024)
    mv = memoryview(b)
    with open(filename, 'rb', buffering=0) as f:
        for n in iter(lambda: f.readinto(mv), 0):
            h.update(mv[:n])
    return h.hexdigest()


def sha1_checksum(filename, block_size=65536):
    # Hash the file by reading block_size bytes at a time until EOF.
    sha1 = hashlib.sha1()
    with open(filename, 'rb') as f:
        for block in iter(lambda: f.read(block_size), b''):
            sha1.update(block)
    return sha1.hexdigest()


# password and file (the output path) are defined elsewhere in the script.
with open('file.bin', 'rb') as original:
    gpg = gnupg.GPG()
    gpg.encoding = 'utf-8'
    encrypt = gpg.encrypt_file(original,
                               recipients=None,
                               passphrase=password,
                               symmetric='AES256',
                               output=file)

sum1 = sha1sum(file)
sum2 = sha1_checksum(file)
print(sum1)
print(sum2)
First run of the script:
697cee13eb4c91f41922472d8768fad076c72166
697cee13eb4c91f41922472d8768fad076c72166
Second run of the script:
a95593f0d8ce274492862b58108a20700ecf9d2b
a95593f0d8ce274492862b58108a20700ecf9d2b
Is sha1sum() or sha1_checksum() wrong?
Or does the encryption produce a different file.gpg on each run?
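This is not a problem with Python, or even with gpg.
Both functions are correct, which is why they agree with each other within each run.
The hash changes because gpg symmetric encryption is non-deterministic (so-called probabilistic): it uses fresh randomness on every run, so the same input and passphrase produce a different file.gpg each time.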
Quote from the Wikipedia page on probabilistic encryption:
Probabilistic encryption is the use of randomness in an encryption algorithm, so that when encrypting the same message several times it will, in general, yield different ciphertexts. The term "probabilistic encryption" is typically used in reference to public key encryption algorithms, however various symmetric key encryption algorithms achieve a similar property (e.g., block ciphers when used in a chaining mode such as CBC).
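
The effect can be reproduced without gpg. The following is only a minimal sketch using the standard library: toy_encrypt and toy_decrypt are hypothetical helpers invented for this illustration, they are not what gpg does internally, and they are not secure. They only show that a fresh random salt per call makes the ciphertext, and therefore its SHA-1, differ between runs, while decryption still recovers the same plaintext.

```python
import hashlib
import os


def toy_encrypt(plaintext: bytes, passphrase: bytes) -> bytes:
    """Toy probabilistic cipher for illustration only (NOT gpg, NOT secure)."""
    salt = os.urandom(16)  # fresh random salt on every call
    key = hashlib.pbkdf2_hmac('sha256', passphrase, salt, 1000)
    stream = hashlib.shake_256(key).digest(len(plaintext))  # keystream
    body = bytes(p ^ s for p, s in zip(plaintext, stream))
    return salt + body  # the salt is stored alongside the ciphertext


def toy_decrypt(ciphertext: bytes, passphrase: bytes) -> bytes:
    """Inverse of toy_encrypt: re-derive the keystream from the stored salt."""
    salt, body = ciphertext[:16], ciphertext[16:]
    key = hashlib.pbkdf2_hmac('sha256', passphrase, salt, 1000)
    stream = hashlib.shake_256(key).digest(len(body))
    return bytes(c ^ s for c, s in zip(body, stream))


data = b'same input every time'
first = toy_encrypt(data, b'secret')
second = toy_encrypt(data, b'secret')

# Same plaintext and passphrase, yet the two ciphertext hashes differ.
print(hashlib.sha1(first).hexdigest())
print(hashlib.sha1(second).hexdigest())

# Both ciphertexts still decrypt to the original data.
print(toy_decrypt(first, b'secret') == toy_decrypt(second, b'secret') == data)
```

If a checksum that is stable across runs is needed, one option is to hash the plaintext (file.bin here) instead of the ciphertext, or to hash file.gpg once and store that value together with the file.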