airflow airflow-scheduler airflow-2.x

Setting multiple DAG dependencies in Airflow


I have multiple ingestion DAGs ('DAG IngD1', 'DAG IngD2', 'DAG IngD3', and so on), each of which ingests data for an individual table.

After the ingestion DAGs have completed successfully, I want to run a single transformation DAG, 'DAG Tran'. In other words, 'DAG Tran' should be triggered only when all of the ingestion DAGs ('DAG IngD1', 'DAG IngD2' and 'DAG IngD3') have finished successfully.

If I use the ExternalTaskSensor operator to achieve this, the external_dag_id parameter is a string and not a list. Does that mean I need three ExternalTaskSensor operators in 'DAG Tran', one for each ingestion DAG? Is my understanding correct, or is there an easier way?


Solution

  • I am currently facing the same DAG dependency management problem.

    My solution is to add a mediator DAG whose task flow expresses the dependencies between the DAGs.

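    # Create a mediator DAG that makes the DAG dependencies explicit.
    from datetime import datetime
    from airflow import DAG
    from airflow.operators.trigger_dagrun import TriggerDagRunOperator

    # start_date is a placeholder; schedule_interval=None means the mediator is triggered manually.
    with DAG("mediator_dag", start_date=datetime(2023, 1, 1), schedule_interval=None) as dag:
        # wait_for_completion=True keeps each task running until the triggered DAG run finishes.
        trigger_dag_a = TriggerDagRunOperator(task_id="trigger_dag_a", trigger_dag_id="a", wait_for_completion=True)
        trigger_dag_b = TriggerDagRunOperator(task_id="trigger_dag_b", trigger_dag_id="b", wait_for_completion=True)
        trigger_dag_c = TriggerDagRunOperator(task_id="trigger_dag_c", trigger_dag_id="c", wait_for_completion=True)

        # Task flow: b and c are triggered only after a has completed.
        trigger_dag_a >> [trigger_dag_b, trigger_dag_c]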
    

    The article "Cross-DAG dependencies in Apache Airflow" might also help you.
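
For the sensor approach raised in the question, one sensor per ingestion DAG is indeed needed, but the sensors can be generated in a loop rather than written out by hand. The following is only a minimal sketch, assuming Airflow 2.x and assuming 'Tran' and the ingestion DAGs share the same schedule (by default ExternalTaskSensor looks for a run with the same logical date). The DAG ids come from the question. The helper names `sensor_task_id` and `build_tran_dag`, the start date, the schedule, the timeout, and the placeholder transformation task are illustrative assumptions.

```python
from datetime import datetime

# DAG ids as named in the question; adjust to the real ids.
INGESTION_DAG_IDS = ["IngD1", "IngD2", "IngD3"]


def sensor_task_id(dag_id: str) -> str:
    """Derive a unique task_id for the sensor that waits on dag_id (hypothetical helper)."""
    return f"wait_for_{dag_id}"


def build_tran_dag():
    """Build 'Tran' with one ExternalTaskSensor per ingestion DAG (hypothetical helper)."""
    # Imported inside the function so the helper above works without Airflow installed.
    from airflow import DAG
    from airflow.operators.python import PythonOperator
    from airflow.sensors.external_task import ExternalTaskSensor

    with DAG(
        dag_id="Tran",
        start_date=datetime(2023, 1, 1),  # assumed; should line up with the ingestion DAGs
        schedule_interval="@daily",       # assumed; same schedule as the ingestion DAGs
        catchup=False,
    ) as dag:
        sensors = [
            ExternalTaskSensor(
                task_id=sensor_task_id(dag_id),
                external_dag_id=dag_id,
                external_task_id=None,  # None waits for the whole DAG run, not one task
                mode="reschedule",      # release the worker slot between checks
                timeout=60 * 60,        # assumed: fail the sensor after one hour
            )
            for dag_id in INGESTION_DAG_IDS
        ]
        transform = PythonOperator(
            task_id="transform",
            python_callable=lambda: print("run transformation"),  # placeholder task
        )
        # With the default trigger rule, transform runs only after every sensor succeeds.
        sensors >> transform
    return dag
```

In a DAG file you would call `dag = build_tran_dag()` at module level so that the scheduler can discover the DAG. If the ingestion DAGs run on a different schedule than 'Tran', the sensors would also need `execution_delta` or `execution_date_fn` to find the matching runs.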