The title is the question: can storing an API-key in a `.env` file for a containerized Python app be considered safe? While writing, an Nginx/Docker set-up question came to mind as well, asked at the bottom.

A little background: I've made a Python app that I want to deploy with Streamlit on a Linux-based server. This is all hobby-level, but I'm aiming to learn industry practices, so I'm treating the web app as a production environment rather than a development one.
The script `MyClass.py` looks like this; the API-key is stored in a `.env` file in the same directory (mock-up example, may be incomplete):

```python
import os

from dotenv import load_dotenv

load_dotenv()
api_key = os.getenv('Super_Private_API_key')


def _api_call(foo: str, key: str) -> str:
    # Make the API call with foo and key, convert the response to a string
    response = f"mock response for {foo}"  # placeholder for the real call
    return response


class MyClass:
    def __init__(self, foo: str):
        self.api_answer = _api_call(foo, api_key)
```
And `app.py` looks like:

```python
import streamlit as st

import MyClass as mc

string = st.text_input(label="input", placeholder="put 'foo' or 'bar' here")
my_variable = mc.MyClass(string)

# Return the API answer
st.write(my_variable.api_answer)
```
The set-up would be an Ubuntu server running a Docker container. The container would also contain Ubuntu or Alpine (for lightweight images), Nginx, and Python 3.12. I would set up the proper networking to expose the app to the outside world.
Question(s):

1. The API-key, placed in a `.env`-file, placed in the same directory as `MyClass.py`, placed in a virtual Python environment (needed when in a container?) to run my app: can that be considered safe/secure?
2. Would separating `MyClass.py` and `app.py` within the container file system be safer? E.g. `~/myapp/app.py` and `~/my_scripts/MyClass.py`, and make the class globally accessible.
3. Is it safer to place the `.env`-file in the base Linux and let the container only read it? Though that feels like the same difference to me.

Additional question when writing this: would it be better to install Nginx on the base Linux instead of inside the container?

I am aware that there are secret managers such as Vault or Secret Manager, but for me it's one step at a time. I am also trying to immerse myself in the mechanics and safeties/vulnerabilities of the tools I am using. In the end, I imagine, using a secret manager is easier to learn than understanding the extent of what to do and what not to do with the tools you are using.
At the end of the day, your program is going to need to have this value, in a variable, in plain text. So the question is how many people do you want to be able to see it before then?
There is a lot of "it depends" here. If your application is handling real-world money or particularly sensitive data, you might want (or need) some more aggressive security settings, and your organization (or certification) might require some specific setups. I can also envision an environment where what you've shown in the question is just fine.
> ... industry practices ... production environment ...
Set up some sort of credential store. Hashicorp Vault is a popular usually-free-to-use option; if you're running in Amazon Web Services, it has a secret manager; there are choices that allow you to put an encrypted file in source control and decrypt it at deploy time; and so on. From there you have three choices:

1. Have the application itself talk to the credential store and fetch the key at startup.
2. Have the deployment system fetch the credential and inject it into the container as an environment variable.
3. Have the deployment system fetch the credential and write it into a file the application reads (such as a `.env` file).

There's a tradeoff of complexity vs. security level here. I've worked with systems that extract the credentials into Kubernetes Secrets which get turned into environment variables; in that setup, given the correct Kubernetes permissions, you could read back the Secret. But given the correct permissions, you could also launch a Pod that mounted the Secret and read it that way, or with a ServiceAccount that allowed it to impersonate the real service to the credential store.
The main points here are that an average developer doesn't actually have the credential, and that the credential can be different in different environments.
> [Is] the API-key, placed in a `.env`-file ... safe/secure?
If the file isn't checked into source control, and nobody else can read it (either through Unix permissions or by controlling login access to the box), it's probably fine, but it depends on your specific organization's requirements.
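If you want to verify the "nobody else can read it" part, a quick check of the Unix mode bits is easy to script. A minimal sketch (the helper name is my invention, not from any library):

```python
import os
import stat


def env_file_is_private(path: str = '.env') -> bool:
    """True if the file has no group/other permission bits set at all."""
    mode = os.stat(path).st_mode
    return not (mode & (stat.S_IRWXG | stat.S_IRWXO))
```

A `chmod 600 .env` (or `chmod 400`) would make this return `True`; the default `644` from most editors would not.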
> Would separating `MyClass.py` and `app.py` within the container file system be safer?
Your credential isn't actually written in either of these files, so it makes no difference. These files should both be in the image, and the credential should be injected at deploy time.
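A hedged sketch of what "injected at deploy time" can look like from the application's side: the image ships only code, and the key arrives as an environment variable. The variable name `MYAPP_API_KEY` and the error message are my inventions:

```python
import os


def require_api_key() -> str:
    # The deploy system is assumed to set this variable, e.g.:
    #   docker run --env-file /path/outside/the/image/.env myapp
    key = os.environ.get('MYAPP_API_KEY')
    if not key:
        raise RuntimeError('MYAPP_API_KEY was not injected at deploy time')
    return key
```

Failing loudly at startup when the variable is missing beats making API calls with `None` as a key.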
> Is it safer to place the `.env`-file in the base Linux and let the container only read it?
Traditionally, environment variables are considered a little less secure: if you're logged into the box, `ps` can often show them, whereas a credential file can be set to a mostly-unreadable mode 0400. I might argue this is a little less relevant in a container world, but only because in practice access to the Docker socket gives you unrestricted root-level access anyways. In Kubernetes, even if a developer has access to the Kubernetes API, they won't usually be able to directly log into the nodes.
This means there's an argument to put the credential in a file not named `.env`. However, if you're trying to make the file user-read-only (mode 0400), then you need to be very specific about the user ID the container is using, and some images don't necessarily tolerate this well. (If your code in the image is owned by root and world-readable but not writable, and you don't write to any directory that's not externally mounted, you're probably in good shape.)
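For completeness, here's a sketch of writing a credential file that is owner-read-only from the moment it exists, rather than chmod-ing it after the fact (the function name is my invention):

```python
import os


def write_secret_file(path: str, secret: str) -> None:
    # O_EXCL refuses to clobber an existing file; the 0o400 mode applies to
    # the newly created file, but the creating file descriptor can still write.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o400)
    try:
        os.write(fd, secret.encode())
    finally:
        os.close(fd)
```

The user-ID caveat above still applies: mode 0400 only helps if the app actually runs as the file's owner.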
> ... install Nginx on the base Linux ...
Unless you have some specific configurations on that Nginx proxy that you think will improve security in some way, there's not a security-related reason to do this. There's an argument that adding an additional software layer decreases security, by adding another thing that could have exploitable bugs (Nginx has a pretty good track record though).
You might want this reverse proxy anyways for other reasons, and it's totally reasonable to add it. That could let you run multiple applications on the same host without using multiple ports. Having a proxy is pretty common (I keep saying Kubernetes, and its Ingress and Gateway objects provide paths to set one up). Having two proxies is IME generally considered mildly unsightly but not a real problem.