In Apache Airflow, if we set a DAG's catchup to True, it will schedule all the runs that have not been executed since the start_date. So if I turn off a DAG and then turn it back on 1 year later, it will schedule tons of runs, and I want to avoid this. Is there any way to set a specific interval for catchup? For example, only catch up the runs that fall within 1 month before the current time.
Thanks a lot in advance!
DAGs have a start_date, but they also have an optional end_date parameter. You should set end_date for your DAG; no runs are scheduled beyond that date.
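from datetime import datetime

from airflow import DAG

DAG(
    dag_id='my_dag',
    # ... other DAG arguments
    catchup=True,
    start_date=datetime(2021, 1, 1),  # first date eligible for scheduling
    end_date=datetime(2022, 2, 1),    # no runs are scheduled after this date
)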
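Note that end_date is a fixed date, so on its own it does not give a window relative to the current time. If that is what you need, one possible approach is to let the catch-up runs be created and skip the ones that are too old. Below is a minimal sketch of such a check in plain Python. The helper name within_catchup_window and the 30-day default are assumptions for illustration, not part of Airflow; a function like this could be called from the first task of the DAG (for example via ShortCircuitOperator) with the run's logical date.

```python
from datetime import datetime, timedelta, timezone


# Hypothetical helper (not an Airflow API): decides whether a run is
# recent enough to be worth executing during catch-up.
def within_catchup_window(logical_date, now=None, window=timedelta(days=30)):
    """Return True if logical_date is at most `window` older than `now`."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now - logical_date <= window


# Example with a fixed "now" so the result is reproducible.
now = datetime(2022, 2, 1, tzinfo=timezone.utc)
print(within_catchup_window(datetime(2022, 1, 15, tzinfo=timezone.utc), now))  # True
print(within_catchup_window(datetime(2021, 1, 15, tzinfo=timezone.utc), now))  # False
```

With this approach the old runs still appear in the UI, but they would be marked as skipped instead of doing any work.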