I have an Airflow DAG that looks like this (Airflow 1.10.15):
[screenshot of the DAG graph, with a legend showing each cube (task) type]
I'm facing an issue that happens only in rare cases (which I still haven't been able to pin down): the "end_of_data_collectors" cube (a DummyOperator) starts before the previous cubes have finished. An important detail is that the "get_rawdata_tables" cube creates a JSON file describing the next cubes that should be opened (the cubes between "get_rawdata_tables" and "end_of_data_collectors"), and we then create them at runtime, so the DAG is not static (which I know is not officially supported or recommended, but it works most of the time). All trigger rules are set to the default, "all_success". A minimal sketch of the pattern is below.
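To make the setup concrete, here's roughly what the DAG file does (the JSON path, DAG id, and collector task names are hypothetical placeholders; only "get_rawdata_tables" and "end_of_data_collectors" are from the real DAG):

```python
# Sketch of the runtime-generated DAG pattern, Airflow 1.10 style.
import json
import os

from airflow import DAG
from airflow.operators.dummy_operator import DummyOperator
from airflow.operators.python_operator import PythonOperator
from airflow.utils.dates import days_ago

TABLES_FILE = "/tmp/rawdata_tables.json"  # hypothetical path written by get_rawdata_tables

with DAG("data_collectors", schedule_interval=None, start_date=days_ago(1)) as dag:
    get_rawdata_tables = PythonOperator(
        task_id="get_rawdata_tables",
        python_callable=lambda: None,  # in the real DAG this writes TABLES_FILE
    )
    end_of_data_collectors = DummyOperator(task_id="end_of_data_collectors")

    # On every parse, read the JSON (if it exists yet) and fan out one
    # collector cube per table between the two fixed cubes.
    if os.path.exists(TABLES_FILE):
        with open(TABLES_FILE) as f:
            tables = json.load(f)
        for table in tables:
            collector = PythonOperator(
                task_id="collect_{}".format(table),
                python_callable=lambda t=table: print("collecting", t),
            )
            get_rawdata_tables >> collector >> end_of_data_collectors
    else:
        get_rawdata_tables >> end_of_data_collectors
```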
I suspect the problem is related to the DAG's long parsing time when there are a lot of dynamic cubes, but I'm not sure.
My questions are:

1. What could cause "end_of_data_collectors" to start before all of its upstream cubes have finished?
2. Is there a way to prevent this while keeping the dynamic DAG?
Thanks
I managed to figure it out: it looks like the Airflow scheduler ignores the DummyOperator cube ("end_of_data_collectors"), skips actually running it, marks it as "finished successfully", and then continues.
I also found evidence of this behavior in the Airflow code:
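I can't quote the exact 1.10.15 lines, but the snippet below is paraphrased from the equivalent check in newer Airflow releases (`DagRun.schedule_tis` in `airflow/models/dagrun.py`), where pure DummyOperator tasks with no callbacks get marked successful by the scheduler itself instead of being queued:

```python
# Paraphrased from DagRun.schedule_tis in newer Airflow; not the literal
# 1.10.15 source. The attribute is inherits_from_dummy_operator in 2.0-2.3
# and inherits_from_empty_operator in 2.4+.
dummy_ti_ids = []
schedulable_ti_ids = []
for ti in schedulable_tis:
    if (
        ti.task.inherits_from_empty_operator  # DummyOperator / EmptyOperator
        and not ti.task.on_execute_callback
        and not ti.task.on_success_callback
        and not ti.task.outlets
    ):
        # The scheduler sets these to SUCCESS directly, without ever
        # sending them to the executor.
        dummy_ti_ids.append(ti.task_id)
    else:
        schedulable_ti_ids.append(ti.key)
```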
Looks like it wasn't a good idea to use a DummyOperator in dynamic use cases: my guess is that the scheduler can evaluate it against a freshly parsed DAG that doesn't yet contain the dynamic cubes, so its dependencies appear satisfied and it gets marked successful too early.
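If anyone else runs into this, one workaround to try (an untested sketch, not verified on 1.10.15) is to replace the Dummy join with a real no-op operator, so the scheduler has to actually execute it:

```python
from airflow import DAG
from airflow.operators.python_operator import PythonOperator
from airflow.utils.dates import days_ago

with DAG("data_collectors", schedule_interval=None, start_date=days_ago(1)) as dag:
    # Genuine no-op task: it runs through the executor like any other task,
    # so the scheduler cannot mark it successful without running it.
    end_of_data_collectors = PythonOperator(
        task_id="end_of_data_collectors",
        python_callable=lambda: None,
    )
```

A DummyOperator with an `on_success_callback` would presumably also defeat the shortcut, since the check above only short-circuits tasks that have no callbacks.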