I have a question and need help. Usually an encrypted file is larger than the unencrypted original. Does the entropy decrease in those cases? I calculate the entropy in Python like this:
    import math

    with open(r"C:\Users\Parisa\Desktop\myfile.txt", 'rb') as f:
        byteArr = list(f.read())
    fileSize = len(byteArr)
    print('File size in bytes: {:,d}'.format(fileSize))

    # calculate the frequency of each byte value in the file
    print('Calculating Shannon entropy of file. Please wait...')
    freqList = []
    for b in range(256):
        ctr = 0
        for byte in byteArr:
            if byte == b:
                ctr += 1
        freqList.append(float(ctr) / fileSize)

    # Shannon entropy
    ent = 0.0
    for freq in freqList:
        if freq > 0:
            ent = ent + freq * math.log(freq, 2)
    ent = -ent
    print('Shannon entropy: {}'.format(ent))
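As a side note, the same calculation can be done in a single pass with `collections.Counter` instead of scanning the file 256 times. This is my own sketch (the helper name `shannon_entropy` is mine, not from the script above), but it computes the identical quantity:

    import math
    from collections import Counter

    def shannon_entropy(data: bytes) -> float:
        """Shannon entropy of a byte string, in bits per byte (0..8)."""
        if not data:
            return 0.0
        counts = Counter(data)  # byte value -> number of occurrences
        n = len(data)
        # sum over byte values that actually occur, so freq > 0 always holds
        return -sum((c / n) * math.log2(c / n) for c in counts.values())

For example, `shannon_entropy(b'aabb')` is exactly 1.0 (two equally likely byte values), and a buffer containing each of the 256 byte values equally often gives the maximum, 8.0.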
(screenshot: stack.imgur.com/jMhkc.png — entropy of the file after encryption and before encryption)
The entropy of the encrypted file is maximised, not decreased. Entropy here is a per-byte measure of the "randomness" of the file's contents, so it does not shrink just because the ciphertext is larger than the plaintext.
This is how a text file usually looks when entropy is graphed against the file contents:

(graph not included)

This is how an encrypted file looks (ignore the tiny spike to low entropy; it was caused by some header information or equivalent):

(graph not included)
Encrypted files have entropy near the maximum: 8 bits per byte with the formula above, or 1 on a normalised 0–1 scale. If they didn't, they wouldn't be well encrypted; patterns mean lower entropy, and lower entropy means weak encryption.