I am getting this error when running any poetry executable:
Traceback (most recent call last):
File "/root/.local/bin/poetry", line 5, in <module>
from poetry.console.application import main
File "/root/.local/share/pypoetry/venv/lib/python3.8/site-packages/poetry/console/application.py", line 11, in <module>
from cleo.application import Application as BaseApplication
ModuleNotFoundError: No module named 'cleo'
My container is built using this logic.
FROM gcr.io/dataflow-templates-base/python38-template-launcher-base:flex_templates_base_image_release_20230508_RC00
ARG DIR=/dataflow/template
ARG dataflow_file_path
ARG PROJECT_ID
# environment to pull the right containers
ARG ENV
ARG TOKEN
ENV COMPOSER_$ENV=1
# copying over necessary files
RUN mkdir -p ${DIR}
WORKDIR ${DIR}
COPY transform/dataflow/${dataflow_file_path}.py beam.py
COPY deploy/dataflow/poetry.lock .
COPY deploy/dataflow/pyproject.toml .
# env var in order to use custom lib, for more info, see:
# https://cloud.google.com/dataflow/docs/guides/templates/configuring-flex-templates#set_required_dockerfile_environment_variables
ENV FLEX_TEMPLATE_PYTHON_PY_FILE="${DIR}/beam.py"
ENV FLEX_TEMPLATE_PYTHON_EXTRA_PACKAGES=""
ENV FLEX_TEMPLATE_PYTHON_PY_OPTIONS=""
ENV PIP_NO_DEPS=True
# install poetry
RUN curl -sSL https://install.python-poetry.org | python -
ENV PATH "/root/.local/bin/:${PATH}"
RUN poetry --version
I have tried uninstalling it and the suggestions from:
These aren't really applicable because they deal with poetry executables in a non-Docker environment, and I'm not really sure what else to do. I have a Dataflow SDK image that was built from apache/beam_python3.8_sdk:2.45.0 with the same logic, and it is working.
I disregarded the last RUN command and built the container. These are the outputs of some checks:
❯ docker run --rm --entrypoint /bin/bash dataflow -c 'which poetry'
/root/.local/bin/poetry
❯ docker run --rm --entrypoint /bin/bash dataflow -c 'poetry'
Traceback (most recent call last):
File "/root/.local/bin/poetry", line 5, in <module>
from poetry.console.application import main
File "/root/.local/share/pypoetry/venv/lib/python3.8/site-packages/poetry/console/application.py", line 11, in <module>
from cleo.application import Application as BaseApplication
ModuleNotFoundError: No module named 'cleo'
❯ docker run --rm --entrypoint /bin/bash dataflow -c 'which python'
/usr/local/bin/python
My assumption is that the poetry executable imports from a virtualenv that the other libraries aren't installed into.
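One way to test this assumption is to check which interpreter the poetry launcher actually runs under. The sketch below is a small shell helper (the name `launcher_interpreter` is hypothetical) that prints the interpreter named on a script's shebang line. The sample launcher it writes is only an assumption about what the installer generates, inferred from the paths in the traceback; inside the container the real file is /root/.local/bin/poetry.

```shell
# Print the interpreter named on a script's shebang line (nothing if there is none).
launcher_interpreter() {
    head -n 1 "$1" | sed -n 's/^#![[:space:]]*//p'
}

# Stand-in launcher for illustration only; its shebang is an assumption
# based on the venv path that appears in the traceback.
sample=$(mktemp)
printf '%s\n' \
    '#!/root/.local/share/pypoetry/venv/bin/python' \
    'from poetry.console.application import main' > "$sample"

launcher_interpreter "$sample"
rm -f "$sample"
```

Against the image from the question, the same check would be `docker run --rm --entrypoint /bin/bash dataflow -c 'head -n 1 /root/.local/bin/poetry'`. If the interpreter turns out to be the venv's Python rather than /usr/local/bin/python, packages installed with the system pip would not be visible to poetry, and what matters is whether cleo exists inside that venv.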
UPDATE:
I went down a rabbit hole and tried this:
RUN pip install --no-cache-dir poetry cleo rapidfuzz importlib_metadata zipp crashtest
After that, running poetry --version worked, but poetry config virtualenvs.create false or any other command throws this error:
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/bin/poetry", line 8, in <module>
sys.exit(main())
File "/usr/local/lib/python3.8/site-packages/poetry/console/application.py", line 409, in main
exit_code: int = Application().run()
File "/usr/local/lib/python3.8/site-packages/cleo/application.py", line 338, in run
self.render_error(e, io)
File "/usr/local/lib/python3.8/site-packages/poetry/console/application.py", line 180, in render_error
self.set_solution_provider_repository(self._get_solution_provider_repository())
File "/usr/local/lib/python3.8/site-packages/poetry/console/application.py", line 398, in _get_solution_provider_repository
from poetry.mixology.solutions.providers.python_requirement_solution_provider import ( # noqa: E501
File "/usr/local/lib/python3.8/site-packages/poetry/mixology/__init__.py", line 5, in <module>
from poetry.mixology.version_solver import VersionSolver
File "/usr/local/lib/python3.8/site-packages/poetry/mixology/version_solver.py", line 8, in <module>
from poetry.core.packages.dependency import Dependency
ModuleNotFoundError: No module named 'poetry.core'
I found the issue. It appears to be the ordering of the instructions in the Dockerfile when building a Dataflow Flex template.
This will work:
# THIS WILL BE MOVED
RUN curl -sSL https://install.python-poetry.org | python3 -
ENV PATH "/root/.local/bin/:${PATH}"
# copying over necessary files
RUN mkdir -p ${DIR}
WORKDIR ${DIR}
COPY transform/dataflow/${dataflow_file_path}.py beam.py
COPY deploy/dataflow/poetry.lock .
COPY deploy/dataflow/pyproject.toml .
# env var in order to use custom lib, for more info, see:
# https://cloud.google.com/dataflow/docs/guides/templates/configuring-flex-templates#set_required_dockerfile_environment_variables
ENV FLEX_TEMPLATE_PYTHON_PY_FILE="${DIR}/beam.py"
ENV FLEX_TEMPLATE_PYTHON_EXTRA_PACKAGES=""
ENV FLEX_TEMPLATE_PYTHON_PY_OPTIONS=""
ENV PIP_NO_DEPS=True
But this will not:
# copying over necessary files
RUN mkdir -p ${DIR}
WORKDIR ${DIR}
COPY transform/dataflow/${dataflow_file_path}.py beam.py
COPY deploy/dataflow/poetry.lock .
COPY deploy/dataflow/pyproject.toml .
# env var in order to use custom lib, for more info, see:
# https://cloud.google.com/dataflow/docs/guides/templates/configuring-flex-templates#set_required_dockerfile_environment_variables
ENV FLEX_TEMPLATE_PYTHON_PY_FILE="${DIR}/beam.py"
ENV FLEX_TEMPLATE_PYTHON_EXTRA_PACKAGES=""
ENV FLEX_TEMPLATE_PYTHON_PY_OPTIONS=""
ENV PIP_NO_DEPS=True
# THIS MOVED
RUN curl -sSL https://install.python-poetry.org | python3 -
ENV PATH "/root/.local/bin/:${PATH}"
Looking a bit deeper, it may be that the environment variables are at play, although I'm not quite sure exactly how.
This is the env output of the working image:
❯ docker run --rm --entrypoint /bin/bash dataflow -c 'env'
_=/usr/bin/env
PATH=/root/.local/bin/:/usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/gcloud/google-cloud-sdk/bin
PYTHON_GET_PIP_URL=https://github.com/pypa/get-pip/raw/1a96dc5acd0303c4700e02655aefd3bc68c78958/public/get-pip.py
PYTHON_GET_PIP_SHA256=d1d09b0f9e745610657a528689ba3ea44a73bd19c60f4c954271b790c71c2653
LD_LIBRARY_PATH=/usr/local/lib
PYTHON_PIP_VERSION=22.0.4
SHLVL=0
LANG=C.UTF-8
HOME=/root
PYTHON_SETUPTOOLS_VERSION=57.5.0
PWD=/
CLOUDSDK_CORE_DISABLE_PROMPTS=yes
PYTHON_VERSION=3.8.16
This is the env output of the image that errors out:
❯ docker run --rm --entrypoint /bin/bash dataflow -c 'env'
_=/usr/bin/env
PIP_NO_DEPS=True
PATH=/root/.local/bin/:/usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/gcloud/google-cloud-sdk/bin
PYTHON_GET_PIP_URL=https://github.com/pypa/get-pip/raw/1a96dc5acd0303c4700e02655aefd3bc68c78958/public/get-pip.py
PYTHON_GET_PIP_SHA256=d1d09b0f9e745610657a528689ba3ea44a73bd19c60f4c954271b790c71c2653
LD_LIBRARY_PATH=/usr/local/lib
PYTHON_PIP_VERSION=22.0.4
SHLVL=0
FLEX_TEMPLATE_PYTHON_EXTRA_PACKAGES=
FLEX_TEMPLATE_PYTHON_PY_OPTIONS=
FLEX_TEMPLATE_PYTHON_PY_FILE=/dataflow/template/beam.py
LANG=C.UTF-8
HOME=/root
PYTHON_SETUPTOOLS_VERSION=57.5.0
PWD=/dataflow/template
CLOUDSDK_CORE_DISABLE_PROMPTS=yes
PYTHON_VERSION=3.8.16
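Comparing the two dumps line by line makes the difference easier to see. The sketch below is a shell helper (the name `env_only_in_second` is hypothetical) that prints the lines present only in the second dump; the sample files hold a subset of the values shown above. Of the extra variables, PIP_NO_DEPS is the one pip itself reads: pip maps `PIP_<OPTION>` environment variables to its command-line options, so PIP_NO_DEPS=True behaves like `--no-deps`. A likely explanation, not confirmed here, is that the pip run performed by the Poetry installer then skips dependencies such as cleo whenever the variable is set before the install step.

```shell
# Print lines that appear only in the second env dump.
env_only_in_second() {
    first_sorted=$(mktemp)
    second_sorted=$(mktemp)
    LC_ALL=C sort "$1" > "$first_sorted"
    LC_ALL=C sort "$2" > "$second_sorted"
    # comm -13 suppresses lines unique to the first file and lines common to both.
    LC_ALL=C comm -13 "$first_sorted" "$second_sorted"
    rm -f "$first_sorted" "$second_sorted"
}

# Sample dumps: a subset of the variables from the two env outputs above.
workdir=$(mktemp -d)
cat > "$workdir/working.env" <<'EOF'
LANG=C.UTF-8
HOME=/root
PWD=/
PYTHON_VERSION=3.8.16
EOF
cat > "$workdir/failing.env" <<'EOF'
PIP_NO_DEPS=True
FLEX_TEMPLATE_PYTHON_EXTRA_PACKAGES=
FLEX_TEMPLATE_PYTHON_PY_OPTIONS=
FLEX_TEMPLATE_PYTHON_PY_FILE=/dataflow/template/beam.py
LANG=C.UTF-8
HOME=/root
PWD=/dataflow/template
PYTHON_VERSION=3.8.16
EOF

env_only_in_second "$workdir/working.env" "$workdir/failing.env"
rm -rf "$workdir"

# env -u drops a variable for a single command, leaving it set elsewhere.
PIP_NO_DEPS=True env -u PIP_NO_DEPS sh -c 'echo "PIP_NO_DEPS is ${PIP_NO_DEPS-unset}"'
```

If that explanation holds, the working ordering succeeds simply because ENV PIP_NO_DEPS=True comes after the install step. An alternative that keeps the original ordering would be to pipe the installer through `env -u PIP_NO_DEPS python3 -`, assuming the base image's `env` supports `-u`.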