I currently run my cronjobs in a Docker container built from a Python image. I start it with a docker-compose file, where I pass the tokens from the host (my MacBook) to the container as environment variables:
version: '3.8'
services:
  backend:
    container_name: py-cont
    build: .
    environment:
      GOOGLE_ADS_TOKEN: ${GOOGLE_ADS_TOKEN}
I would like to migrate the cronjobs to Airflow (run using the docker-compose file from here), where I want to use the DockerOperator, but I do not know how to pass the environment variables from the host to achieve exactly what my docker-compose setup did.
This is my DAG, which throws a KeyError in the Airflow logs when run, while trying to fetch an env var that doesn't exist (the var is sourced on the host; I confirmed that by echo-ing it):
DockerOperator(
    dag=dag,
    task_id='refresh_tickers',
    image='mypythonimage',
    api_version='auto',
    auto_remove=True,
    environment={
        'GOOGLE_ADS_TOKEN': os.environ['GOOGLE_ADS_TOKEN']
    },
    command='echo $GOOGLE_ADS_TOKEN',
    docker_url='tcp://docker-proxy:2375',
    network_mode='bridge',
)
I'm new to Airflow, and it could be that I am misunderstanding what the host is in this case: is it one of the many containers defined in the docker-compose file, rather than my MacBook? That would be confusing, because the volumes parameter of DockerOperator mounts my local (MacBook) file path into the container with no problems.
Thanks.
The first step is to check that the env variable exists on your host:
echo $GOOGLE_ADS_TOKEN
The second step is to add the env variable to the scheduler and all the worker containers. To do that, update the docker-compose file:
version: '3'
x-airflow-common:
  &airflow-common
  # In order to add custom dependencies or upgrade provider packages you can use your extended image.
  # Comment the image line, place your Dockerfile in the directory where you placed the docker-compose.yaml
  # and uncomment the "build" line below. Then run `docker-compose build` to build the images.
  image: ${AIRFLOW_IMAGE_NAME:-apache/airflow:2.3.3}
  # build: .
  environment:
    &airflow-common-env
    AIRFLOW__CORE__EXECUTOR: CeleryExecutor
    ...
    _PIP_ADDITIONAL_REQUIREMENTS: ${_PIP_ADDITIONAL_REQUIREMENTS:-}
    GOOGLE_ADS_TOKEN: ${GOOGLE_ADS_TOKEN}
The last step is to check that the env variable exists in the worker container (recreate the containers with `docker-compose up -d` first, so the updated environment is applied):
docker-compose exec airflow-worker bash -c 'echo "$GOOGLE_ADS_TOKEN"'
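Once the variable is present in the scheduler and worker containers, the os.environ lookup in your DAG file will resolve. As a small defensive sketch (the helper name get_required_env is mine, not an Airflow API), you can fail with a clearer message than a bare KeyError if the variable is still missing when the DAG is parsed:

```python
import os

def get_required_env(name: str) -> str:
    """Return the value of an environment variable, failing loudly if unset."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(
            f"{name} is not set in this container's environment; "
            "check the x-airflow-common environment block in docker-compose.yaml"
        )
    return value

# In the DAG, instead of os.environ['GOOGLE_ADS_TOKEN']:
# environment={'GOOGLE_ADS_TOKEN': get_required_env('GOOGLE_ADS_TOKEN')}
```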