Tags: python, python-3.x, python-2.7, anaconda, miniconda

How to aggregate conda environments?


I am trying to provision a project that is already set up, along with the library it depends on.

Both use Miniconda environments to define their dependencies.

Here is an initial script:

#!/bin/bash

project_home_path=`dirname $( cd "$(dirname "$0")" ; pwd -P )`
source /home/${USER}/miniconda/etc/profile.d/conda.sh
conda env create -f ${project_home_path}/environment.yml > /dev/null 2>&1
conda activate <env-name>
/home/${USER}/miniconda/bin/app.py & echo $! > /tmp/env-name.pid

This did not work because the conda activate <env-name> line failed to activate the environment, so the required libraries were not available.

After going through the documentation [1], I fleshed out this script:

#!/bin/bash

project_home_path=/home/${USER}/folder
source /home/${USER}/miniconda/etc/profile.d/conda.sh
conda env create --force -f ${project_home_path}/project/environment.yml  > /dev/null 2>&1
conda env create --force -f ${project_home_path}/library/environment.yml > /dev/null 2>&1
conda env export -n <project-env> > /tmp/env.yml
conda env update -n base -f /tmp/env.yml > /dev/null 2>&1
conda env export -n <library-env> > /tmp/env.yml
conda env update -n base -f /tmp/env.yml > /dev/null 2>&1
cd /home/${USER}/folder/library && python setup.py install
cd /home/${USER}/folder/project && python setup.py install

This does perform the aggregation required for production, and it works, but I am wondering how this can be done better.

[1] https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html


Solution

    1. Create one environment that contains all the libraries and packages you need.
    2. conda env export -n MyOneEnvironment -f everything.yml
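
Steps 1 and 2 could be sketched roughly as follows, starting from the two environment.yml files in the question. The helper name build_combined_env, the environment name, and the file paths are assumptions made for illustration, and the sketch assumes that the two files can be solved together without conflicting package versions.

```shell
#!/bin/bash
# Sketch only: build ONE environment from two environment.yml files,
# then export it. build_combined_env is a hypothetical helper, not a conda command.
build_combined_env() {
    local env_name="$1"
    local project_yml="$2"
    local library_yml="$3"
    local out_yml="$4"

    # Create the environment from the first file; -n sets the environment name.
    conda env create -n "$env_name" -f "$project_yml" || return 1
    # Add the packages listed in the second file to the same environment.
    conda env update -n "$env_name" -f "$library_yml" || return 1
    # Write the complete package list to a single file for later provisioning.
    conda env export -n "$env_name" -f "$out_yml" || return 1
}

# Example call (names and paths are placeholders):
# build_combined_env MyOneEnvironment project/environment.yml library/environment.yml everything.yml
```

If the update step fails to solve, that is a sign the two dependency lists conflict and need to be reconciled by hand before exporting.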

    Then for provisioning:

    • conda env create -n TheNewEnvironment -f everything.yml
    • conda activate TheNewEnvironment
    • install whatever you need that isn't a conda package
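
The provisioning steps above might look like this in a script. This is a minimal sketch: provision_env is a hypothetical helper, and the default conda.sh location is an assumption based on the Miniconda path used in the question.

```shell
#!/bin/bash
# Sketch only: provision a machine from the exported everything.yml.
# provision_env is a hypothetical helper; the default conda.sh path is an assumption.
provision_env() {
    local env_name="$1"
    local env_yml="$2"
    local conda_sh="${3:-$HOME/miniconda/etc/profile.d/conda.sh}"

    # 'conda activate' only works in a script after conda.sh has been sourced.
    if [ ! -r "$conda_sh" ]; then
        echo "conda.sh not found: $conda_sh" >&2
        return 1
    fi
    . "$conda_sh"

    conda env create -n "$env_name" -f "$env_yml" || return 1
    conda activate "$env_name" || return 1
    # Anything that is not a conda package is installed here, for example
    # the 'python setup.py install' steps from the question.
}

# Example call (names are placeholders):
# provision_env TheNewEnvironment everything.yml
```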

    After provisioning, the conda activate command needs to be repeated each time you want to run the program in that environment.
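
Because activation has to happen in every new shell, a small launcher can do it before starting the program. The following is a sketch under assumptions: run_in_env is a hypothetical helper, and the conda.sh path and program name in the example call are placeholders.

```shell
#!/bin/bash
# Sketch only: activate an existing environment, then run a command in it.
# run_in_env is a hypothetical helper, not part of conda.
run_in_env() {
    local conda_sh="$1"
    local env_name="$2"
    shift 2

    # Load conda's shell integration so 'conda activate' is available.
    if [ ! -r "$conda_sh" ]; then
        echo "conda.sh not found: $conda_sh" >&2
        return 1
    fi
    . "$conda_sh"

    conda activate "$env_name" || return 1
    # Run the remaining arguments as the command, inside the activated environment.
    "$@"
}

# Example call (paths and program name are placeholders):
# run_in_env "$HOME/miniconda/etc/profile.d/conda.sh" TheNewEnvironment python app.py
```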

    In your second script example, you're creating two environments from yml files, just to re-export the list of installed packages, and then install them into the base environment. So you're messing with a total of three environments.

    If there is a requirement that you must install stuff into the conda base environment, then collect one everything.yml file with everything that's needed, and use conda env update -n base --file everything.yml.
    However, it is a bad idea to pollute the base environment in this way. If you need to install the prerequisites for a project or program, then you should create a dedicated conda environment for that and leave the base environment alone. Then you can install any number of projects and programs into separate environments, without any of them interfering with the others.