Tags: airflow, status, directed-acyclic-graphs, airflow-2.x, health-monitoring

How to enable DAG Processor status in Airflow


I have an Airflow cluster running with Docker. Everything seems to work fine, except that the DAG Processor status is not reported properly.

When I check the cluster status, the DAG Processor is shown as Unknown.

So I am investigating how to get the DAG Processor to report its status correctly.

I have confirmed that the components are running and report a healthy status:

docker ps | grep airflow
b3945b8cd533   apache/airflow:2.7.1   "/usr/bin/dumb-init …"   3 days ago   Up 3 days (healthy)   8080/tcp        airflow_airflow-webserver_1
c19eb3571734   apache/airflow:2.7.1   "/usr/bin/dumb-init …"   3 days ago   Up 3 days (healthy)   8080/tcp        airflow_airflow-worker_1
07403ea2cdd0   apache/airflow:2.7.1   "/usr/bin/dumb-init …"   3 days ago   Up 3 days (healthy)   8080/tcp        airflow_airflow-scheduler_1
6b53355eae24   apache/airflow:2.7.1   "/usr/bin/dumb-init …"   3 days ago   Up 3 days (healthy)   8080/tcp        airflow_airflow-triggerer_1
1f8dfbc2766b   redis:latest           "docker-entrypoint.s…"   4 days ago   Up 4 days (healthy)   6379/tcp        airflow_airflow-cache_1
53c2926b6eea   postgres:16            "docker-entrypoint.s…"   4 days ago   Up 4 days (healthy)   5432/tcp        airflow_airflow-database_1

In the configuration, the DAG Processor log location is properly defined:

dag_processor_manager_log_location = /opt/airflow/logs/dag_processor_manager/dag_processor_manager.log

And the file does exist at the required location:

docker exec -it airflow_airflow-worker_1 bash
ls -l /opt/airflow/logs/dag_processor_manager
total 18444
-rw-rw-r-- 1 default root 18880639 Oct 10 06:15 dag_processor_manager.log

Here are my questions: Why is my DAG Processor status Unknown? And how should I configure the cluster so that this status is reported properly?


Solution

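  • It doesn't look like you are running a standalone DagProcessorManager component in your environment: your docker ps output lists a webserver, worker, scheduler and triggerer, but no DAG processor container. As of Airflow 2.3.0, the DagProcessorManager can be run as a separate process to decouple some responsibility from the Scheduler.

    This mode is disabled by default, but you can enable it with the standalone_dag_processor option in the [scheduler] section of the configuration (or the AIRFLOW__SCHEDULER__STANDALONE_DAG_PROCESSOR environment variable) and by running airflow dag-processor as its own service. Once enabled, the DAG Processor health should be reported as you'd like by the /health endpoint.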
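
    To confirm the change took effect, the /health response can be checked from a script. The following is a minimal sketch, not an official client: it assumes the payload maps each component name to an object with a status field, and the base URL http://localhost:8080 is a placeholder for wherever your webserver is reachable (in the setup above, port 8080 is not published to the host). The helper names component_statuses and fetch_health are made up for this example.

```python
import json
from urllib.request import urlopen


def component_statuses(health: dict) -> dict:
    """Map each component in a /health payload to its status string.

    A component whose entry or status is missing/null is reported as
    "unknown" (assumed here to be what the UI shows as Unknown).
    """
    return {
        name: (info or {}).get("status") or "unknown"
        for name, info in health.items()
    }


def fetch_health(base_url: str = "http://localhost:8080") -> dict:
    """Fetch and decode the webserver's /health payload.

    base_url is an assumption; adjust it to your deployment.
    """
    with urlopen(f"{base_url}/health", timeout=10) as resp:
        return json.load(resp)


# Usage (requires a reachable webserver):
#   for name, status in component_statuses(fetch_health()).items():
#       print(f"{name}: {status}")
```

    If dag_processor still comes back as unknown after enabling the standalone mode, it is worth checking that the dag-processor service is actually running and sending heartbeats.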