python, airflow, directed-acyclic-graphs

Broken DAG issue (Airflow 2.5.0)


Broken DAG: [/opt/airflow/dags/dag.py] Traceback (most recent call last):
  File "/opt/airflow/dags/dag.py", line 7, in <module>
    from training import training
  File "/opt/airflow/dags/training.py", line 6, in <module>
    from joblib import dump
ModuleNotFoundError: No module named 'joblib'

I already have the 'joblib' module installed, so why is it showing this "module not found" error?


Solution

  • The discussion behind the answer below can be found at https://github.com/apache/airflow/discussions/28661

    You should build your own image, extending the official one with the packages you need: https://airflow.apache.org/docs/docker-stack/build.html#adding-new-pypi-packages-individually

    Using _PIP_ADDITIONAL_REQUIREMENTS is highly discouraged for anything but quick iteration while debugging your installation (see the note for details).

    You don't even have to do much. Docker Compose fully supports automatically building an image with new dependencies and using it. See this comment in the docker-compose.yaml file:

    https://github.com/apache/airflow/blob/main/docs/apache-airflow/howto/docker-compose/docker-compose.yaml#L46

    1. To add custom dependencies or upgrade provider packages, you can use your extended image.
    2. Comment out the image line, place your Dockerfile in the directory where you placed the docker-compose.yaml, and uncomment the "build" line below it.
    3. Then run docker-compose build to build the images.

    The two lines the comment refers to are:

        image: ${AIRFLOW_IMAGE_NAME:-apache/airflow:|version|}
        build: .

    You can also run docker compose up --build as a shortcut if you do not want to run docker compose build separately.
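
    After rebuilding, it can help to confirm that the package is visible to the interpreter that actually parses the DAGs, since a module installed on the host is not available inside the Airflow containers. The following is a minimal sketch, not part of the original answer: the helper name is_importable is hypothetical, and it only handles top-level module names such as joblib.

```python
import importlib.util
import sys


def is_importable(module_name: str) -> bool:
    """Return True if the current interpreter can locate a top-level module."""
    # find_spec returns None when a top-level module cannot be found.
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Default to 'joblib', the module named in the traceback above.
    name = sys.argv[1] if len(sys.argv) > 1 else "joblib"
    # Printing the interpreter path shows which environment is being checked.
    print(f"interpreter: {sys.executable}")
    print(f"{name} importable: {is_importable(name)}")
```

    Run it with the Python interpreter inside one of the Airflow containers (for example through docker compose exec) rather than on the host. If it reports False there, the running image still lacks the package.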