etl, airflow, airflow-scheduler

Why are all of my Airflow DAGs one run behind?


I'm setting up Airflow right now and loving it, except that my DAGs are perpetually running behind. See the picture below: it was taken on 2/19 at 15:50 UTC, and each DAG should have run exactly one more time between its last run and the present. (The couple of exceptions are DAGs that are currently turned off.) Is there some piece of configuration I missed?

[Image: my dags!]


Solution

  • False alarm! Airflow just labels execution times differently than I expected. It turns out that an hourly job that runs at 15:00 is labeled "14:00" and covers the data from 14:00 to 15:00 (14:00 + 1:00).

    From https://airflow.apache.org/scheduler.html:

    Note that if you run a DAG on a schedule_interval of one day, the run stamped 2016-01-01 will be triggered soon after 2016-01-01T23:59. In other words, the job instance is started once the period it covers has ended.

    Let’s Repeat That: The scheduler runs your job one schedule_interval AFTER the start date, at the END of the period.
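
To make the labeling rule concrete, here is a minimal sketch of the arithmetic in plain Python. It does not use Airflow itself; the function names `actual_trigger_time` and `latest_finished_label` are hypothetical helpers made up for illustration, and the year in the example is arbitrary because the question only gives "2/19 at 15:50 UTC". It also assumes a fixed-length interval and a first period that starts at midnight.

```python
from datetime import datetime, timedelta


def actual_trigger_time(execution_date, schedule_interval):
    """Earliest moment the run labeled `execution_date` can start.

    The label marks the START of the period; the run is triggered
    once that period has ended.
    """
    return execution_date + schedule_interval


def latest_finished_label(now, first_period_start, schedule_interval):
    """Label of the most recent period that has fully ended by `now`.

    Returns None if no period has completed yet.
    """
    elapsed = now - first_period_start
    completed = elapsed // schedule_interval  # timedelta // timedelta gives an int
    if completed < 1:
        return None
    return first_period_start + (completed - 1) * schedule_interval


if __name__ == "__main__":
    # Daily example from the quoted docs: the run stamped 2016-01-01
    # starts once that day is over.
    print(actual_trigger_time(datetime(2016, 1, 1), timedelta(days=1)))

    # Hourly example matching the question (year chosen arbitrarily):
    # at 15:50 the newest run you can expect to see is labeled 14:00.
    now = datetime(2018, 2, 19, 15, 50)
    print(latest_finished_label(now, datetime(2018, 2, 19), timedelta(hours=1)))
```

Read this way, a most recent run that appears one interval older than the wall clock is the expected state, not a scheduler that has fallen behind.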