The DAG constructor has a "schedule_interval" argument that accepts a cron expression. As far as I can tell, cron has a limitation: it cannot run a job every 25 consecutive days. Below is the cron expression I tried. Because the day-of-month step restarts each month, it runs at 10:05 on days 1 and 26 of every month rather than every 25 days.
5 10 */25 * *
But I need a job/DAG to run every 25 consecutive days. Is there a way to schedule the DAG so that it satisfies this requirement?
You can set schedule_interval using datetime.timedelta.
For example, to schedule a DAG to run for the first time 25 days from today at 10:05 CET and then every 25 days after that, the DAG script can be specified like this:
import pendulum
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.dummy_operator import DummyOperator

default_args = {
    'owner': 'Airflow',
    # Timezone-aware start date: 2019-11-24 at 10:05, Europe/Berlin
    'start_date': datetime(
        2019, 11, 24, 10, 5, tzinfo=pendulum.timezone('Europe/Berlin')
    ),
}

# A timedelta interval counts 25 days between runs, ignoring month boundaries
with DAG(
    'my_dag', schedule_interval=timedelta(days=25), default_args=default_args,
) as dag:
    op = DummyOperator(task_id='dummy')
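
To see why the timedelta interval behaves differently from the cron step, here is a rough sketch in plain Python (no Airflow needed). It assumes that Airflow triggers a run once its interval has elapsed, so the first trigger falls at start_date plus 25 days. The helper names interval_run_times and cron_step_days are made up for illustration and are not part of Airflow. The start date is kept naive (no timezone) to keep the arithmetic simple.

```python
from datetime import datetime, timedelta


def interval_run_times(start, interval, count):
    """Return the first `count` trigger times for a fixed-interval schedule.

    Assumption: a run is triggered when its interval ends, so the first
    trigger is at start + interval, the second at start + 2 * interval, etc.
    """
    return [start + interval * (i + 1) for i in range(count)]


def cron_step_days(step, days_in_month=31):
    """Days of the month matched by a `*/step` day-of-month cron field."""
    return list(range(1, days_in_month + 1, step))


# Naive datetime matching the start_date in the DAG above (timezone omitted)
start = datetime(2019, 11, 24, 10, 5)
runs = interval_run_times(start, timedelta(days=25), 3)

for run in runs:
    print(run.isoformat())
# 2019-12-19T10:05:00
# 2020-01-13T10:05:00
# 2020-02-07T10:05:00

# The cron step restarts every month, so it only ever matches these days
print(cron_step_days(25))
# [1, 26]
```

The timedelta dates drift across the calendar (the 19th, then the 13th, then the 7th), which is exactly the "every 25 consecutive days" behaviour, whereas the cron step always lands on the same two days of each month.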