python-3.x git airflow python-module

How to use non-installable modules from DAG code?


I have a Git repository which (among other things) holds Airflow DAGs in an airflow directory. I have a clone of the repository beside the Airflow install directory. The AIRFLOW_HOME configuration variable points to the airflow directory in the Git clone.

I would like to allow imports from modules in the repository that are located outside the airflow folder (please see the structure below).

<repo root>
   |_airflow
      |_dags
         |_dag.py
   |_module1
   |_module2
   |_...

So that in dag.py I can do:

from module1 import Module1

Currently, this does not seem possible without tricks such as editing sys.path explicitly, which is not very elegant and has to be done in each of the DAG source files.

Making an installable package out of module1 is also out of the question.


Solution

  • Summarizing the conclusions from the discussions here


    Broadly, there are two possible approaches:

    1. Package your code into an Airflow plugin
    2. Make your code discoverable to the processes that parse the DAG definition files by updating PYTHONPATH. Here, again, there are two options:

      (a) Update PYTHONPATH at the system level via .bashrc or an equivalent (once and for all), or just export the updated PYTHONPATH for the current bash session

      (b) Programmatically update sys.path at the beginning of each DAG definition file
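
    As a rough illustration of option (b), here is a minimal sketch, assuming the layout from the question (<repo root>/airflow/dags/dag.py). The helper name add_repo_root_to_sys_path and its levels_up parameter are made up for this example and are not part of Airflow:

```python
import sys
from pathlib import Path


def add_repo_root_to_sys_path(dag_file, levels_up=2):
    """Prepend the repository root to sys.path and return it.

    Hypothetical helper, not an Airflow API.
    levels_up=2 assumes <repo root>/airflow/dags/dag.py:
    parents[0] is dags, parents[1] is airflow, parents[2] is the repo root.
    """
    repo_root = Path(dag_file).resolve().parents[levels_up]
    root_str = str(repo_root)
    # Avoid adding the same entry again each time the DAG file is re-parsed.
    if root_str not in sys.path:
        sys.path.insert(0, root_str)
    return repo_root


# Intended use at the top of dag.py, before importing from module1:
# add_repo_root_to_sys_path(__file__)
# from module1 import Module1
```

    The same few lines can also be inlined at the top of dag.py. Either way, the path is derived from the DAG file's own location, so nothing depends on where the repository is cloned. If the directory nesting differs from the layout above, levels_up has to be adjusted accordingly.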