Tags: python, airflow, airflow-scheduler

Stop performing remaining tasks in airflow


I have three tasks, t1, t2, and t3. Each task's output is the next task's input; for example, t1's output is t2's input. After t1 completes, I sometimes get an empty output folder (which can happen in my case, is acceptable, and still marks t1 as success), but then t2 fails because there are no files to fetch from t1's output. I want t2 and t3 to be marked as success (or skipped) when there are no files. How can I skip the next two tasks?


I went through the Airflow docs and other articles and came across sensors and the poke method, but I'm not sure how to proceed with that.


Solution

  • You can leverage a sensor, specifically the `FileSensor`, to check whether a file exists before running the downstream tasks. Set its `soft_fail` argument to `True` so that when the sensor times out without finding the file, it is marked as "skipped" instead of "failed"; with the default trigger rules, its downstream tasks are then skipped as well. This lets the DAG run succeed while preserving an accurate history of what happened on the file check.