My goal is to have a GitLab CI/CD pipeline that builds my conda packages for me. For very large projects, conda is so slow that GitLab times out, so we are using mamba instead. GitLab uses a Kubernetes runner, and what I've noticed is that my Docker container works fine when I build and run it locally on my machine, but when the Kubernetes executor runs it, the conda environment doesn't have the required packages installed for some reason.
The Docker image gets generated from this Dockerfile:
FROM ubuntu:focal
SHELL ["/bin/bash", "-l", "-c"]
RUN apt-get update && apt-get install -y wget
# Install mamba
RUN wget -q -P /root/ https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-Linux-x86_64.sh
RUN sh /root/Mambaforge-Linux-x86_64.sh -b
RUN /root/mambaforge/bin/conda shell.bash hook > /root/activatehook.sh
# Create an environment and install curl and numpy
RUN source /root/activatehook.sh && mamba create -n build-env -y -c conda-forge python=3.9
RUN source /root/activatehook.sh && conda activate build-env && mamba install curl numpy
Now if I build that locally, I can run
sudo docker run <my image> /bin/bash -c "source /root/activatehook.sh && conda activate build-env && mamba info && mamba list"
and I see (among other things) that build-env is the active environment, curl is installed, and numpy is installed.
Now I move that into my GitLab CI script:
stages:
  - test-stage
test-job:
  stage: test-stage
  tags:
    - kubernetes
  image: <my-image>
  script:
    - /bin/bash -c "source /root/activatehook.sh && conda activate build-env && mamba info && mamba list"
When this runs, the output from GitLab indicates that build-env is the active environment and curl is installed, but numpy is not installed!
I can't figure out where to go with this. The conda environment exists and is active, and one of the packages in it is properly installed, but the other is not. Furthermore, when I pull the image to my local host and run the same command manually, both curl and numpy are installed as expected!
Also important: I am aware of the mambaforge docker image. I have tried something like this:
FROM condaforge/mambaforge
RUN mamba create -y --name build-env python=3.9
RUN mamba install -n build-env -y -c conda-forge curl numpy==1.21
In this case, I get a similar result, except that, when run from the Kubernetes runner, neither curl nor numpy is installed! If I pull the image to my local host, the environment is again fine (both packages are correctly installed). Can anyone help explain this behavior?
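One way to narrow down a difference like this is to print a few facts about the runtime in both places (local docker run and the CI job) and compare them. The following is only a minimal sketch: the function name describe_runtime is made up for illustration, and the path /root/mambaforge/envs/build-env is an assumption taken from the first Dockerfile above.

```shell
#!/bin/bash
# Hypothetical helper: print facts that may differ between a local
# `docker run` and a CI runner (user, HOME, whether conda is on PATH).
describe_runtime() {
    echo "user=$(id -un 2>/dev/null || echo unknown)"
    echo "HOME=${HOME:-unset}"
    if command -v conda >/dev/null 2>&1; then
        echo "conda=$(command -v conda)"
        conda env list
    else
        echo "conda=not on PATH"
    fi
    # Assumed install location, based on the Dockerfile in the question.
    if [ -d /root/mambaforge/envs/build-env ]; then
        echo "build-env=present"
    else
        echo "build-env=missing"
    fi
}

report="$(describe_runtime)"
echo "$report"
```

Running the same script locally and as the first line of the CI job, then diffing the two outputs, should show whether the job even sees the same user, home directory, and conda installation.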
The issue is that the "source" command is not being executed on the runner, so the environment is never actually activated. To work around this, you can add an ENTRYPOINT script to the Dockerfile that activates the environment before running your command. You can then check that numpy appears in the mamba list output by following these steps.
First, create an entrypoint.sh file:
#!/bin/bash
source /root/activatehook.sh
conda activate build-env
exec "$@"
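To see why this wrapper pattern works, here is a small stand-alone sketch that does not need conda at all. It assumes a throwaway wrapper script that exports a made-up variable, DEMO_ENV, in place of the real "source ... && conda activate build-env" lines, and then hands control to whatever command was passed in via exec "$@". The command inherits everything the wrapper set up.

```shell
#!/bin/bash
# Build a throwaway wrapper that mimics entrypoint.sh.
wrapper="$(mktemp)"
cat > "$wrapper" <<'EOF'
#!/bin/bash
# Stand-in for: source /root/activatehook.sh && conda activate build-env
export DEMO_ENV="build-env"
# Replace this shell with the requested command, keeping the environment.
exec "$@"
EOF

# Invoke it the same way the CI script invokes entrypoint.sh.
# (Run through bash so the sketch also works where the temp dir is noexec.)
result="$(bash "$wrapper" /bin/bash -c 'echo "active env: $DEMO_ENV"')"
echo "$result"

rm -f "$wrapper"
```

The inner /bin/bash -c command sees DEMO_ENV even though it never set it, which is the same mechanism that lets "mamba list" see the activated environment when it is launched through entrypoint.sh.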
Then copy the entrypoint file into the container. (I have also extended the Dockerfile a little.)
Dockerfile:
FROM ubuntu:focal
SHELL ["/bin/bash", "-c"]
WORKDIR /root/
RUN apt-get update && \
    apt-get install -y wget
RUN wget -q https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-Linux-x86_64.sh
RUN bash Mambaforge-Linux-x86_64.sh -b -p /root/mambaforge
RUN rm /root/Mambaforge-Linux-x86_64.sh
ENV PATH="/root/mambaforge/bin:$PATH"
RUN conda shell.bash hook > /root/activatehook.sh
RUN source /root/activatehook.sh && \
    conda create -n build-env -y -c conda-forge python=3.9 && \
    echo "source /root/activatehook.sh && conda activate build-env" >> /root/.bashrc && \
    /root/mambaforge/bin/conda install -n build-env -c conda-forge -y curl numpy
COPY entrypoint.sh /root/entrypoint.sh
RUN chmod +x /root/entrypoint.sh
RUN apt-get -y autoclean && \
    apt-get -y autoremove && \
    rm -rf /var/lib/apt/lists/*
ENTRYPOINT ["/root/entrypoint.sh"]
GitLab CI script:
stages:
  - test-stage
test-job:
  stage: test-stage
  tags:
    - kubernetes
  image: <your-image>
  script:
    - /root/entrypoint.sh /bin/bash -c "mamba info && mamba list"
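If you would rather have the job fail loudly than read through the mamba list output by eye, a small filter along these lines could help. This is only a sketch: check_packages is a hypothetical helper, and the sample listing below is made up (including its version and build columns) to imitate the column layout that conda list and mamba list print.

```shell
#!/bin/bash
# Hypothetical helper: read a package listing on stdin and report, for each
# name given as an argument, whether it appears in the first column.
check_packages() {
    local listing pkg
    listing="$(cat)"
    for pkg in "$@"; do
        if printf '%s\n' "$listing" |
            awk -v p="$pkg" '$1 == p { found = 1 } END { exit !found }'; then
            echo "$pkg: installed"
        else
            echo "$pkg: NOT installed"
        fi
    done
}

# Made-up sample listing for demonstration only (not real output).
sample_listing='# packages in environment at /root/mambaforge/envs/build-env:
#
# Name                    Version                   Build  Channel
curl                      0.0.0                         0    conda-forge
python                    3.9.0                         0    conda-forge'

result="$(printf '%s\n' "$sample_listing" | check_packages curl numpy)"
echo "$result"
```

In the CI job, the sample listing would be replaced by real output, for example by piping "conda list -n build-env" into check_packages curl numpy after the environment has been set up by the entrypoint.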