There is something I don't understand about the execution date. I have the following DAG:
from airflow import DAG
from airflow.operators.bash_operator import BashOperator
from datetime import datetime

default_args = {
    'owner': 'me',
    'depends_on_past': True,
    'email': '[email protected]',
    'email_on_failure': False,
    'email_on_retry': False,
    'retries': 0,
}

dag = DAG(
    'dag_test',
    default_args=default_args,
    description="DAG test",
    schedule_interval='0 15 * * *',
    concurrency=1,
    catchup=False,
    start_date=datetime(2024, 1, 1),
)

task = BashOperator(
    task_id='task',
    bash_command='echo 1',
    dag=dag,
)
When I activate the DAG, it runs every day at 3 PM, but the execution date is the day before. For example, when the DAG is triggered on 16 February, the execution date is 15 February.
Thanks for your help.
I expect the trigger date and the execution date to be the same.
You need to look at how the data interval works for DAG runs.
A DAG run is usually scheduled after its associated data interval has ended, to ensure the run is able to collect all the data within the time period. In other words, a run covering the data period of 2020-01-01 generally does not start to run until 2020-01-01 has ended, i.e. after 2020-01-02 00:00:00.
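A DAG run is executed at the end of the period of time it covers, which helps keep runs idempotent: each run always processes the same, complete window of data. With your schedule, the run triggered on 16 February at 15:00 covers the interval from 15 February 15:00 to 16 February 15:00, and the execution date is the start of that interval, hence 15 February.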
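To make the arithmetic concrete, here is a minimal sketch in plain Python (no Airflow import). The helper `daily_data_interval` is a hypothetical name, not an Airflow function, and it assumes a fixed once-a-day schedule like your `'0 15 * * *'`, where a run starts at the moment its interval ends.

```python
from datetime import datetime, timedelta


def daily_data_interval(trigger_time):
    """Return (start, end) of the data interval for a once-a-day schedule.

    Hypothetical helper for illustration only: it assumes the run is
    triggered exactly when its one-day interval ends.
    """
    interval_end = trigger_time
    interval_start = trigger_time - timedelta(days=1)
    return interval_start, interval_end


# The run that is triggered on 16 February at 15:00
trigger = datetime(2024, 2, 16, 15, 0)
start, end = daily_data_interval(trigger)

# The execution date corresponds to the START of the interval
print("execution date:   ", start)  # 2024-02-15 15:00:00
print("data interval end:", end)    # 2024-02-16 15:00:00
```

If what you want inside the task is the date on which the run actually starts, then as far as I know recent Airflow versions (2.2 and later) expose the end of the interval in templates as `{{ data_interval_end }}`, which for this schedule matches the trigger time.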
A best practice when designing DAGs is to handle data using time partitioning.
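As a rough illustration of time partitioning, each run can derive the partition it reads or writes from the start of its own interval, so re-running the same run always touches the same partition. The `partition_path` helper and the `data/dt=YYYY-MM-DD` layout below are assumptions made up for this example, not something Airflow provides.

```python
from datetime import datetime


def partition_path(interval_start, base="data"):
    """Build a per-day partition path from the start of a data interval.

    Hypothetical layout for illustration: <base>/dt=YYYY-MM-DD
    """
    return f"{base}/dt={interval_start:%Y-%m-%d}"


# The run triggered on 16 February works on the partition for 15 February
print(partition_path(datetime(2024, 2, 15, 15, 0)))  # data/dt=2024-02-15
```

With this kind of design, the execution date being "the day before" is what you want: it names the data the run is responsible for, not the moment the run happened to start.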