airflow, airflow-scheduler

Airflow: Simple DAG with one task never finishes


I have made a very simple DAG that looks like this:

from datetime import datetime
from airflow import DAG
from airflow.operators.bash_operator import BashOperator

cleanup_command = "/home/ubuntu/airflow/dags/scripts/log_cleanup/log_cleanup.sh "

dag = DAG(
    'log_cleanup',
    description='DAG for deleting old logs',
    schedule_interval='10 13 * * *',
    start_date=datetime(2018, 3, 30),
    catchup=False,
)

t1 = BashOperator(task_id='cleanup_task', bash_command=cleanup_command, dag=dag)

The task finishes successfully, but despite this the DAG remains in the "running" state. Any idea what could cause this? The screenshot below shows the issue, with the DAG run still marked as running. The earlier runs only finished because I manually marked their status as success. [Edit: I had originally written: "The earlier runs are only finished because I manually set status to running."]

Screenshot showing that the task has finished but the DAG is still running


Solution

  • "The earlier runs are only finished because I manually set status to running."

    Are you sure your scheduler is running? You can start it with $ airflow scheduler and check the scheduler CLI command docs. You shouldn't have to manually set tasks to running.

    Your code here seems fine. One thing you might try is restarting your scheduler.

    In the Airflow metadata database, a DAG run's end state is tracked separately from its task instances' end states. I've seen this happen before, but it usually resolves itself on the scheduler's next loop, once it notices that all of the tasks in the DAG run have reached a final state (success, failed, or skipped).
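
    If you want to see that disconnect directly, here is a minimal sketch that queries the metadata database for a DAG run's state alongside the states of its task instances. It assumes Airflow 1.x-style imports (to match the DAG above) and the default SQLAlchemy session from airflow.settings; the dag_id 'log_cleanup' is taken from the question.

    # Minimal sketch: compare a DAG run's state with its task instances' states
    # for the 'log_cleanup' DAG, straight from the Airflow metadata database.
    from airflow import settings
    from airflow.models import DagRun, TaskInstance

    session = settings.Session()
    for run in session.query(DagRun).filter(DagRun.dag_id == 'log_cleanup'):
        task_instances = session.query(TaskInstance).filter(
            TaskInstance.dag_id == run.dag_id,
            TaskInstance.execution_date == run.execution_date,
        )
        print(run.execution_date, 'dag_run state:', run.state)
        for ti in task_instances:
            print('  task', ti.task_id, 'state:', ti.state)
    session.close()

    If a task instance shows success while its dag_run row is still running, that is exactly the disconnect described above, and a healthy scheduler should close the run out on its next loop.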

    Are you running the LocalExecutor, SequentialExecutor, or something else here?
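
    If you're not sure, one quick way to check which executor this installation is using is to read it from the loaded Airflow configuration. This is just a small sketch, assuming the standard executor option under [core] in airflow.cfg.

    # Minimal sketch: print the executor configured for this Airflow install.
    # Assumes the standard [core] executor option in airflow.cfg.
    from airflow.configuration import conf

    print(conf.get('core', 'executor'))  # e.g. SequentialExecutor or LocalExecutor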