Search code examples
airflowkubernetes-helm

How to automatically refresh DAGs through docker image in helm chart deployed Airflow


I have a deployment of Airflow running in a kubernetes cluster. I deployed it there using the official helm chart (described here). I manage DAGs using the recommended way, by "baking" them into a docker image (described here).

When I create or change a DAG, I run the docker build and docker push commands as listed in the documentation. This all works great. However, my changes don't show up in the GUI until I delete the scheduler and the webserver pods (kubectl delete pod airflow-scheduler-xxx, etc), forcing kubernetes to spin up new ones. This is not ideal for CI/CD purposes, as I don't want to continuously have to do this manually. Is there a way to pick up the changes automatically (e.g. by pulling the image periodically)?

I already tried to set dag_dir_list_interval in airflow.cfg (suggested here) through the helm values, but this doesn't seem to change anything. PullPolicy is set to Always because I use the latest tag. It looks like this in my override.yaml:

images:
  airflow:
    repository: -registry-name-
    tag: latest
    pullPolicy: Always

config: 
  scheduler: 
    # after how much time a new DAGs should be picked up from the filesystem
    min_file_process_interval: 0
    dag_dir_list_interval: 60

Solution

  • I was eventually able to figure this out myself. The reason the pods weren't updating (i.e. weren't pulling the new image with the updated DAGs) is because I was using the latest tag. After building and pushing the new image, the cluster didn't know to pull that new image, because the tag was still the same. The solution is to not use the latest tag, but to use a tag that changes every time CI/CD runs. I chose to use $(Build.SourceVersion) in Azure Pipelines, which corresponds to the git hash of the commit that caused the pipeline to run. Then I added a helm upgrade command with a flag pointing to the new image tag. This command causes the relevant pods to be updated automatically.

    helm upgrade --install airflow apache-airflow/airflow --set images.airflow.tag="$(Build.SourceVersion)"