html, database, security, account

How do you protect data that you don't want to hash?


Correct me if any of my assumptions are off.

When you hash something with an algorithm like sha1, you can't reverse the hash to get the original string.

Because of this, if I have an email stored in the database that I will need to use later, I can't use sha1 on it.

However, I still want to protect it in case of a breach, so what do I do?

I'm using Django, which stores a SECRET_KEY in settings.py.

I tried using AES encryption and noticed that the longer the input string is, the longer the encrypted string returned, which makes sense. However, the encrypted string is much longer than the original string. Is there a type of encryption where the string returned is the same size as the original string? I'm using the Django user model, where the email is limited to 75 characters, so if a user has a 32-75 character email, the encrypted string is 128 characters long, which is more than 75 and can't be stored in the column.


Solution

  • The three key concepts of information security are confidentiality, integrity, and availability. A cryptographic hash like SHA1 provides integrity: you can always check against the hash value to see whether the email has been tampered with. In your case, you want confidentiality, which a hash function will not provide: you want emails to be unreadable in the database in case the database is compromised. While a symmetric encryption algorithm is part of the answer, the bigger question is key management. Once you have a key to encrypt and decrypt emails, how will you store it? How many people will have access to the key? Will it be kept on the same computer as the database? (That's dangerous.) Will it be kept on the same network? How often will you change the keys? What happens if you lose the key? In all likelihood, your infrastructure will be just as vulnerable to a data breach with encrypted emails as with unencrypted ones. Security is hard, and it's better to focus your efforts on auditing your database setup, a problem many people have already worked on and one with well-known, production-tested solutions, rather than creating a complicated cryptographic system.
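On the length part of the question, a rough sketch of the arithmetic may help show why the stored value grows. It assumes a common setup that the question does not confirm: AES in a block mode with PKCS#7 padding, a 16-byte IV stored in front of the ciphertext, and the result encoded as text (base64 or hex) so it fits in a character column. The function names are hypothetical, and zero bytes stand in for real ciphertext because only the length matters here.

```python
import base64

AES_BLOCK = 16  # AES block size in bytes
IV_LEN = 16     # assumption: a 16-byte IV is stored alongside the ciphertext


def padded_length(n: int) -> int:
    """Length in bytes after PKCS#7 padding, which always adds 1 to 16 bytes."""
    return (n // AES_BLOCK + 1) * AES_BLOCK


def stored_length(n: int, encoding: str = "base64") -> int:
    """Characters needed to store an n-byte plaintext as encoded text."""
    # Zero bytes stand in for the IV plus ciphertext; only the size is of interest.
    raw = bytes(IV_LEN + padded_length(n))
    if encoding == "base64":
        return len(base64.b64encode(raw))
    if encoding == "hex":
        return len(raw.hex())
    raise ValueError(f"unknown encoding: {encoding}")


if __name__ == "__main__":
    # Print plaintext length, base64 length, and hex length for a few sizes.
    for n in (20, 32, 75):
        print(n, stored_length(n), stored_length(n, "hex"))
```

Under these assumptions a 75-byte address pads to 80 bytes, grows to 96 with the IV, and becomes 128 characters in base64. Padding, the IV, and the text encoding each add overhead, so widening the column (or keeping the encrypted value in a separate field) is usually simpler than looking for a cipher whose output matches the input length.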