I use Docker and Docker Compose to package scientific tools into easily, universally executable modules. One example is a Docker image that packages a rather complicated Python library into a container that runs a Jupyter notebook server; the idea is that other scientists who are not terribly tech-savvy can clone a GitHub repository, run docker-compose up, then do their analyses without having to install the library, configure various plugins and other dependencies, etc.
I have this all working fine except that I'm having issues getting the volume mounts to work in a coherent fashion. The reason for this is that the library inside the docker container handles multiple kinds of datasets, which users will store in several separate directories that are conventionally tracked through shell environment variables. (Please don't tell me this is a bad way to do this--it's the way things are done in the field, not the way I've chosen to do things.) So, for example, if the user stores FreeSurfer data, they will have an environment variable named SUBJECTS_DIR that points to the directory containing the data; if they store HCP data, they will have an environment variable HCP_SUBJECTS_DIR. However, they may have both, either, or neither of these set (as well as a few others).
I would like to be able to put something like this in my docker-compose.yml file in order to handle these cases:
version: '3'
services:
  my_fancy_library:
    build: .
    ports:
      - "8080:8888"
    environment:
      - HCP_SUBJECTS_DIR="/hcp_subjects"
      - SUBJECTS_DIR="/freesurfer_subjects"
    volumes:
      - "$SUBJECTS_DIR:/freesurfer_subjects"
      - "$HCP_SUBJECTS_DIR:/hcp_subjects"
In testing this, if the user has both environment variables set, everything works swimmingly. However, if they don't have one of these set, I get an error about not mounting directories that are fewer than 2 characters long (which I interpret to be a complaint about mounting a volume specified by ":/hcp_subjects").
This question asks basically the same thing, and the answer points to here, which, if I'm understanding it right, basically explains how to have multiple docker-compose files that are resolved in some fashion. This isn't really a viable solution for my case for a few reasons:
The only decent solution I've been able to come up with is to ask the users to run a script ./run.sh instead of docker-compose up; the script examines the environment variables, writes out its own docker-compose.yml file with the appropriate volumes, and runs docker-compose up itself. This also seems somewhat clunky, but it works.
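For concreteness, a wrapper along those lines might look like the sketch below. It is only an illustration of the approach described above, not the actual script: the function names volume_lines and write_compose and the output file name docker-compose.generated.yml are hypothetical, and the environment section is left out for brevity.

```shell
#!/usr/bin/env sh
# Sketch of a wrapper that only emits volume entries for variables that are set.
# Function names and the generated file name are hypothetical.

# Print one compose volume entry per dataset variable that is set and non-empty.
volume_lines() {
    if [ -n "${SUBJECTS_DIR:-}" ]; then
        printf '      - "%s:/freesurfer_subjects"\n' "$SUBJECTS_DIR"
    fi
    if [ -n "${HCP_SUBJECTS_DIR:-}" ]; then
        printf '      - "%s:/hcp_subjects"\n' "$HCP_SUBJECTS_DIR"
    fi
}

# Write a complete compose file to stdout; the volumes key is omitted
# entirely when no dataset variable is set.
write_compose() {
    cat <<'EOF'
version: '3'
services:
  my_fancy_library:
    build: .
    ports:
      - "8080:8888"
EOF
    vols=$(volume_lines)
    if [ -n "$vols" ]; then
        printf '    volumes:\n%s\n' "$vols"
    fi
}

# Intended use (not executed here):
#   write_compose > docker-compose.generated.yml
#   docker-compose -f docker-compose.generated.yml up
```

Keeping the file generation in a function that writes to stdout makes it possible to inspect the result before handing it to docker-compose.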
Does anyone know of a way to conditionally mount a set of volumes based on the state of the environment variables when docker-compose up is run?
You can set defaults for environment variables in a .env file shipped alongside the docker-compose.yml [1].
By setting your environment variables to /dev/null by default and then handling this case in the containerized application, you should be able to achieve what you need.
$ tree -a
.
├── docker-compose.yml
├── Dockerfile
├── .env
└── run.sh
docker-compose.yml:
version: "3"
services:
  test:
    build: .
    environment:
      - VOL_DST=${VOL_DST}
    volumes:
      - "${VOL_SRC}:${VOL_DST}"
Dockerfile:
FROM alpine
COPY run.sh /run.sh
ENTRYPOINT ["/run.sh"]
.env:
VOL_SRC=/dev/null
VOL_DST=/volume
run.sh:
#!/usr/bin/env sh
set -euo pipefail
# A /dev/null placeholder is not a directory, so -d tells the two cases apart.
if [ ! -d "${VOL_DST}" ]; then
    echo "${VOL_DST} not mounted"
else
    echo "${VOL_DST} mounted"
fi
Environment variable VOL_SRC not defined:
$ docker-compose up
Starting test_test_1 ... done
Attaching to test_test_1
test_1 | /volume not mounted
test_test_1 exited with code 0
Environment variable VOL_SRC defined:
$ VOL_SRC="./" docker-compose up
Recreating test_test_1 ... done
Attaching to test_test_1
test_1 | /volume mounted
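Applied to the variables from the question, the "handle it in the container" step could be sketched as below. This assumes the .env file defaults both SUBJECTS_DIR and HCP_SUBJECTS_DIR to /dev/null on the host, and that inside the container they point at /freesurfer_subjects and /hcp_subjects as in the question's compose file. The function names is_real_mount and drop_placeholder_mounts are hypothetical, and whether the library treats an unset variable as "no dataset" is also an assumption.

```shell
#!/usr/bin/env sh
# Sketch of an entrypoint helper; function names are hypothetical.

# Succeeds only if the path exists and is a directory.
# A /dev/null placeholder mount shows up as a character device, so it fails.
is_real_mount() {
    [ -d "$1" ]
}

# Unset each dataset variable whose mount point is only a placeholder,
# so the application sees it as not configured.
drop_placeholder_mounts() {
    if ! is_real_mount "${SUBJECTS_DIR:-}"; then
        unset SUBJECTS_DIR
    fi
    if ! is_real_mount "${HCP_SUBJECTS_DIR:-}"; then
        unset HCP_SUBJECTS_DIR
    fi
}

# In an entrypoint script this would typically be followed by:
#   drop_placeholder_mounts
#   exec "$@"
```

The directory test mirrors the check in run.sh above; only the reaction to a missing mount differs.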
[1] https://docs.docker.com/compose/environment-variables/#the-env-file