I'm setting up Airflow right now and loving it, except for the fact that my dags are perpetually running behind. See the picture below - this was taken on 2/19 at 15:50 UTC, and you can see that for each of the dags, they should have run exactly one more time between the last time they ran and the present time (there are a couple for which this is not true - those ones are currently turned off). Is there some piece of configuration I missed?
False alarm! Airflow just labels execution times differently than I expected. It turns out that an hourly job that runs at 15:00 is labeled "14:00" and covers the data from 14:00 to 15:00 (14:00 plus one hour).
From https://airflow.apache.org/scheduler.html:
Note that if you run a DAG on a schedule_interval of one day, the run stamped 2016-01-01 will be triggered soon after 2016-01-01T23:59. In other words, the job instance is started once the period it covers has ended.
Let’s Repeat That: the scheduler runs your job one schedule_interval AFTER the start date, at the END of the period.
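To make the labeling rule concrete, here is a minimal sketch of the arithmetic in plain Python. It does not use Airflow at all; the helper names `trigger_time` and `latest_execution_date` are hypothetical, and it assumes a fixed `timedelta` interval rather than a cron expression.

```python
from datetime import datetime, timedelta


def trigger_time(execution_date, schedule_interval):
    """Earliest moment a run labeled `execution_date` can start.

    Hypothetical helper: the run starts once the period it covers has ended.
    """
    return execution_date + schedule_interval


def latest_execution_date(now, start_date, schedule_interval):
    """Label of the most recent run whose period has fully ended by `now`.

    Hypothetical helper; returns None if no full period has elapsed yet.
    """
    if now < start_date + schedule_interval:
        return None
    # Number of complete periods between start_date and now (an int).
    completed = (now - start_date) // schedule_interval
    # The newest complete period began one interval before it ended.
    return start_date + (completed - 1) * schedule_interval


# Daily schedule: the run stamped 2016-01-01 starts right after that day ends.
print(trigger_time(datetime(2016, 1, 1), timedelta(days=1)))
# 2016-01-02 00:00:00

# Hourly schedule checked at 15:50: the newest run is labeled 14:00.
print(latest_execution_date(
    now=datetime(2016, 1, 1, 15, 50),
    start_date=datetime(2016, 1, 1),
    schedule_interval=timedelta(hours=1),
))
# 2016-01-01 14:00:00
```

So a UI that shows "14:00" as the last run at 15:50 is up to date for an hourly DAG, not an hour behind.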