I'm using Flask and the cryptography package to receive a .csv file from the user, encrypt it, store it in a MySQL database, and later retrieve and decrypt it.
Once the user has uploaded the file, I encrypt it as follows:
from cryptography.fernet import Fernet
# Read the file as bytes
# file is <class 'werkzeug.datastructures.FileStorage'>; file_bytes is <class 'bytes'>
file_bytes = file.stream.read()
# Get the main encrypter/decrypter
f = Fernet(app.config['CRYPTOGRAPHY_KEY'])
# Encrypt the bytes
# encrypted is <class 'bytes'> and is 526156 bytes long
encrypted = f.encrypt(file_bytes)
# Convert the encrypted bytes to a string
# encrypted_string is <class 'str'>
encrypted_string = encrypted.decode()
# Store the encrypted file into the MySQL DB in a BLOB field
# flask_mysqldb is being used here.
store_file(encrypted_string)
Once this is stored, MySQL Workbench shows that the BLOB value holds only 65535 bytes.
When I retrieve the encrypted file from the DB, I get:
# encrypted_bytes_database is <class 'bytes'> and is 65535 bytes long
encrypted_bytes_database = retrieve_file_from_db(user_id)
When I try to decrypt it using:
f.decrypt(encrypted_bytes_database)
The following error is raised:
File "MY_ENV/lib/python3.8/site-packages/cryptography/fernet.py", line 104, in _get_unverified_token_data
raise InvalidToken
cryptography.fernet.InvalidToken
If I decrypt the encrypted variable directly, it works with no errors, but decrypting encrypted_bytes_database raises this error. Any idea what is going wrong here?
I found out that the problem was the MySQL data type: the column was too small, so the 526156-byte token was cut off at 65535 bytes and Fernet could no longer validate it.
BLOB: holds up to 65,535 bytes.
MEDIUMBLOB: holds up to 16,777,215 bytes.
Changing the column type to MEDIUMBLOB made everything work.
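To choose the column type up front rather than by trial and error, the token size can be estimated from the file size. The following is only a sketch: the helper names `fernet_token_len` and `smallest_blob_type` are made up for illustration, and the size formula assumes the standard Fernet token layout (1 version byte, 8 timestamp bytes, 16 IV bytes, PKCS7-padded AES-CBC ciphertext, 32 HMAC bytes, all base64url-encoded with padding).

```python
# Maximum payload sizes of MySQL's BLOB family, in bytes.
BLOB_LIMITS = [
    ("TINYBLOB", 255),
    ("BLOB", 65_535),
    ("MEDIUMBLOB", 16_777_215),
    ("LONGBLOB", 4_294_967_295),
]


def fernet_token_len(plaintext_len: int) -> int:
    """Estimate the Fernet token length (hypothetical helper)."""
    # PKCS7 padding always adds between 1 and 16 bytes.
    ciphertext_len = 16 * (plaintext_len // 16 + 1)
    # version (1) + timestamp (8) + IV (16) + ciphertext + HMAC (32)
    raw_len = 1 + 8 + 16 + ciphertext_len + 32
    # Padded base64 emits 4 output bytes for every 3 input bytes (rounded up).
    return 4 * ((raw_len + 2) // 3)


def smallest_blob_type(n_bytes: int) -> str:
    """Return the smallest BLOB type that can hold n_bytes (hypothetical helper)."""
    for name, limit in BLOB_LIMITS:
        if n_bytes <= limit:
            return name
    raise ValueError(f"{n_bytes} bytes does not fit in any BLOB type")
```

For the 526156-byte token from the question, `smallest_blob_type(526156)` returns `"MEDIUMBLOB"`, which matches the fix above.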
This is how I decrypt encrypted_bytes_database and read the .csv file into a pandas DataFrame:
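import io

import pandas as pd
from cryptography.fernet import Fernet

# Use the same key that was used to encrypt the file
f = Fernet(app.config['CRYPTOGRAPHY_KEY'])
# Decrypt the token read back from the DB, then parse the CSV text
decrypted_db = f.decrypt(encrypted_bytes_database)
df = pd.read_csv(io.StringIO(decrypted_db.decode()))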
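A cheap check before calling decrypt can turn this kind of truncation into a clearer error than InvalidToken. This is a heuristic sketch, not part of the cryptography API: `looks_truncated` is a hypothetical helper. It relies on Fernet tokens being padded base64, so their length is always a multiple of 4, while every MySQL BLOB size cap (255, 65535, 16777215, 4294967295) is odd.

```python
def looks_truncated(token: bytes) -> bool:
    """Heuristic: True if the token cannot be a complete Fernet token."""
    # The shortest possible Fernet token (empty plaintext) is 100 bytes,
    # and padded base64 output always has a length divisible by 4.
    return len(token) < 100 or len(token) % 4 != 0
```

A token cut off at a BLOB cap always fails this check; a token cut at some other length that happens to be a multiple of 4 would still pass, so decrypt remains the real validation.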