python-3.x, encryption, privacy, data-masking

InvalidToken when decrypting a column in a table using Python


Does anyone have an idea how to solve this InvalidToken issue?

The key should be the same for encryption and decryption, but I am still getting an InvalidToken error.

I attached my code and an image of the fake table.

Cheers!

import pandas as pd
from cryptography.fernet import Fernet, InvalidToken

# Create an example dataframe
data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Salary': [50000, 70000, 90000]}
df = pd.DataFrame(data)

# Generate a key for encryption
key = Fernet.generate_key()

# Create a Fernet object using the key
f = Fernet(key)

# Encrypt the 'Salary' column
df['Salary'] = df['Salary'].apply(lambda x: f.encrypt(str(x).encode()))

# Save the encrypted data to a CSV file
df.to_csv('encrypted_data.csv', index=False)

# Load the encrypted data from the CSV file
df = pd.read_csv('encrypted_data.csv')

# Decrypt the 'Salary' column
try:
  df['Salary'] = df['Salary'].apply(lambda x: int(f.decrypt(x.encode()).decode()))
except InvalidToken as e:
  print(f"Error: {e}")
  print(f"Key: {key}")
  raise

# Print the decrypted data
print(df)

[Screenshot: the table with the encrypted Salary column]

I ran the code above and expected the Salary column to be decrypted. Instead, I got an InvalidToken error.


Solution

  • f.encrypt(...) returns a bytes object. When it is written with to_csv(), its string representation b'...' is stored, as can be seen in your screenshot.
    When the file is loaded with read_csv(), this text comes back as the str "b'...'", so x.encode() produces b"b'...'" instead of the original token, and decryption fails with InvalidToken.
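
The same effect can be reproduced without pandas or Fernet. The following is a minimal sketch using only the standard library; the byte string is a made-up stand-in for a real token, and an in-memory buffer replaces the CSV file:

```python
import csv
import io

# Hypothetical stand-in for a Fernet token (not a real ciphertext).
token = b"gAAAAABexample"

# Writing a bytes object to CSV stores its string representation.
buffer = io.StringIO()
csv.writer(buffer).writerow([token])

# Reading it back yields the literal text "b'...'" as a str.
buffer.seek(0)
loaded = next(csv.reader(buffer))[0]

print(loaded)                    # b'gAAAAABexample'
print(loaded.encode())           # b"b'gAAAAABexample'"
print(loaded.encode() == token)  # False
```

The re-encoded value carries the extra b'...' wrapper, so it no longer matches the bytes that were originally produced.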

    To avoid this, decode the ciphertext to a str right after encrypting (a Fernet token is URL-safe Base64, so it decodes cleanly as UTF-8):

    df['Salary'] = df['Salary'].apply(lambda x: f.encrypt(str(x).encode()).decode())
    

    Decryption then works, and print(df) shows the encrypted and the decrypted data as expected:

    [Screenshot: output of print(df) for the encrypted and the decrypted data]
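
For reference, the whole round trip with the decoded ciphertext might look like the sketch below. It reuses the example data from the question; the in-memory buffer in place of 'encrypted_data.csv' and the separate name loaded are assumptions made only to keep the example self-contained.

```python
import io

import pandas as pd
from cryptography.fernet import Fernet

df = pd.DataFrame({"Name": ["Alice", "Bob", "Charlie"],
                   "Salary": [50000, 70000, 90000]})

key = Fernet.generate_key()
f = Fernet(key)

# Encrypt, then decode the token to str so the CSV holds the bare token text.
df["Salary"] = df["Salary"].apply(lambda x: f.encrypt(str(x).encode()).decode())

# Assumption: an in-memory buffer stands in for the CSV file on disk.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
buffer.seek(0)
loaded = pd.read_csv(buffer)

# The loaded values are plain token strings, so encode() restores the token bytes.
loaded["Salary"] = loaded["Salary"].apply(lambda x: int(f.decrypt(x.encode()).decode()))
print(loaded)
```

If decryption happens in a separate run, the key itself also has to be stored and reloaded, because Fernet.generate_key() returns a new key on every call.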