
Is there a way to limit the Apache Airflow catchup interval?


In Apache Airflow, if a DAG's catchup is set to True, the scheduler creates all the runs that have not been executed since the start_date. So if I turn off a DAG and turn it back on a year later, it will schedule a huge number of runs, which I want to avoid. Is there any way to set a specific interval for catchup? For example, catch up only the runs that fall within the past month relative to the current time. Thanks a lot in advance!


Solution

  • DAGs have a start_date, but they also accept an optional end_date parameter. The scheduler only creates runs between start_date and end_date.

    Set end_date on your DAG:

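    from datetime import datetime

    from airflow import DAG

    DAG(
        dag_id='my_dag',
        # ... other DAG arguments ...
        catchup=True,
        start_date=datetime(2021, 1, 1),
        end_date=datetime(2022, 2, 1),  # no runs are scheduled after this date
    )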
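
Note that end_date is an absolute cutoff, while the question asks for a rolling window (runs from roughly the last month). As a rough sketch of that idea, and not something the answer above proposes, each run could compare its logical date against the current time and skip work that is too old. The helper name is_within_catchup_window and the 30-day window are assumptions made for illustration; the function uses only the standard library.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical window size: "1 month" from the question, approximated as 30 days.
CATCHUP_WINDOW = timedelta(days=30)


def is_within_catchup_window(logical_date, now=None, window=CATCHUP_WINDOW):
    """Return True if the run's logical date is no older than `window`."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now - logical_date <= window


# Example with a fixed "now" so the result is reproducible.
now = datetime(2022, 2, 1, tzinfo=timezone.utc)
recent = is_within_catchup_window(datetime(2022, 1, 15, tzinfo=timezone.utc), now=now)
stale = is_within_catchup_window(datetime(2021, 1, 1, tzinfo=timezone.utc), now=now)
print(recent, stale)  # True False
```

In Airflow, a check like this could plausibly serve as the python_callable of a ShortCircuitOperator placed at the start of the DAG, so that older catchup runs are still created but their downstream tasks are skipped. That wiring is an assumption and is not shown here.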