anaconda conda virtual-environment

Modifying conda configuration file does not reflect changes in environment


I am trying to change the default installation location for Conda environments because the system I am using (a supercomputing cluster) has a ~20GB user home quota. Under normal circumstances, this could easily be done by editing ~/.condarc and adding an envs_dirs section, which is explained quite well in this question and answer.

However, the compute environment I am in (the supercomputer) does not seem to let me change the priority of the environment locations. Ideally, I would place /work/helikarlab/joshl/.conda/envs, which is on a high-storage partition, at the top of the list so that I can install additional environments as needed.

My ~/.condarc is configured as follows:

env_prompt: ({name})
channels:
  - conda-forge
  - bioconda
  - defaults
auto_activate_base: false
envs_dirs:
  - /work/helikarlab/joshl/.conda/envs/

Yet conda config --show envs_dirs reports the following entries:

envs_dirs:
  - /home/helikarlab/joshl/.conda/envs
  - /util/opt/anaconda/deployed-conda-envs/packages/python/envs
  - /util/opt/anaconda/deployed-conda-envs/packages/perl/envs
  - /util/opt/anaconda/deployed-conda-envs/packages/git/envs
  - /util/opt/anaconda/deployed-conda-envs/packages/nano/envs
  - /work/helikarlab/joshl/.conda/envs
  - /home/helikarlab/joshl/.conda/envs/base_env/envs

Does anyone know why my attempt to set envs_dirs is not working? How can I give /work/helikarlab/joshl/.conda/envs the highest priority?


Additional Info

Here is the result from conda config --show-sources

==> /util/opt/anaconda/4.9.2/.condarc <==
allow_softlinks: False
auto_update_conda: False
auto_activate_base: False
notify_outdated_conda: False
repodata_threads: 4
verify_threads: 4
execute_threads: 2
aggressive_update_packages: []
pkgs_dirs:
  - ${WORK}/.conda/pkgs
  - ${HOME}/.conda/pkgs
channel_priority: disabled
channels:
  - hcc
  - https://conda.anaconda.org/t/<TOKEN>/hcc
  - conda-forge
  - bioconda
  - defaults
  - file:///util/opt/conda_repo

==> /home/helikarlab/joshl/.condarc <==
auto_activate_base: False
env_prompt: ({name})
envs_dirs:
  - /work/helikarlab/joshl/.conda/envs/
channel_priority: disabled
channels:
  - conda-forge
  - bioconda
  - defaults

==> envvars <==
envs_path:
  - /home/helikarlab/joshl/.conda/envs
  - /util/opt/anaconda/deployed-conda-envs/packages/python/envs
  - /util/opt/anaconda/deployed-conda-envs/packages/perl/envs
  - /util/opt/anaconda/deployed-conda-envs/packages/git/envs
  - /util/opt/anaconda/deployed-conda-envs/packages/nano/envs

Solution

    Background: Conda's configuration priorities

    As documented in the post "The Conda Configuration Engine for Power Users", Conda draws configuration values from four sources, listed here from lowest to highest priority:

    1. Default values in the Python code
    2. .condarc configuration files (system < user < environment < working directory)
    3. Environment variables (CONDA_* variables)
    4. Command-line specifications

    Problem: Environment variable prioritized

    We can observe how this plays out in OP's case with the --show-sources result. Three sources are reported, and envs_dirs is defined in two of them:

    1. System-level configuration file at /util/opt/anaconda/4.9.2/.condarc (sets pkgs_dirs, but not envs_dirs)
    2. User-level configuration file at /home/helikarlab/joshl/.condarc
    3. Environment variable CONDA_ENVS_PATH [1]

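    The environment variable has higher priority than any .condarc file, and because envs_dirs is a list, Conda merges the values from all sources with the higher-priority entries placed first. That is why /work/helikarlab/joshl/.conda/envs shows up only sixth in the conda config --show envs_dirs output, and why /home/helikarlab/joshl/.conda/envs remains the preferred directory no matter what is set with conda config or .condarc files.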
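To confirm which directory will actually be preferred, one can look at the first entry of the merged envs_dirs list, since Conda places new named environments in the first writable directory on that list. The following is a minimal sketch, assuming a Bash-compatible shell; the helper name first_envs_dir is made up for illustration.

```shell
# Hypothetical helper: print the first entry of an `envs_dirs:` listing
# read from standard input (the format printed by `conda config --show envs_dirs`).
first_envs_dir() {
  # Keep only the "  - /some/path" lines, strip the list marker, take the first.
  sed -n 's/^[[:space:]]*- //p' | head -n 1
}

# Only query Conda when it is actually on the PATH.
if command -v conda >/dev/null 2>&1; then
  conda config --show envs_dirs | first_envs_dir
fi
```

With the output shown in the question, this would print /home/helikarlab/joshl/.conda/envs, which is the directory the workarounds aim to displace.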

    Workarounds

    All of the following workarounds involve manipulating the environment variable. It is unclear where the variable is set (probably in a system-level shell configuration file). Appending any one of the workarounds to a user-level shell configuration file (e.g., ~/.bashrc, ~/.bash_profile, ~/.zshrc) should reliably override it.
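Since the origin of the variable is not known, a quick search of the usual shell startup files may help locate it. The sketch below assumes a Bash-compatible shell; the helper name find_var_definitions and the list of candidate files are illustrative guesses. On a cluster the variable may instead be set by an environment-module system, in which case this search will find nothing.

```shell
# Hypothetical helper: list which of the given files mention a variable name.
# Usage: find_var_definitions NAME FILE...
find_var_definitions() {
  local name="$1"
  shift
  # -l prints only the names of matching files; -s silences missing-file errors.
  grep -ls -- "$name" "$@"
}

# Candidate startup files (an assumption; adjust for the system at hand).
# grep exits non-zero when nothing matches, so ignore that status here.
find_var_definitions CONDA_ENVS_PATH \
  /etc/profile /etc/profile.d/*.sh /etc/bashrc \
  "$HOME/.bashrc" "$HOME/.bash_profile" "$HOME/.profile" || true
```

If a user-level file turns up in the results, editing it directly may be simpler than overriding the value afterwards.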

    Option 1: Unset variable

    One could completely remove the variable with

    unset CONDA_ENVS_PATH
    

    This would then allow the user-level .condarc to take priority.

    However, this variable also appears to provide locations for several system-level shared environments. It is unclear how integral these shared environments are for normal functionality. So, removing the variable altogether could have additional consequences.

    Option 2: Replace value

    Conveniently, the default and desired locations differ only in that /home is replaced by /work. The variable can be edited directly with:

    export CONDA_ENVS_PATH=${CONDA_ENVS_PATH/\/home/\/work}
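
Before committing this substitution to a startup file, it can be tried on a copy of the value. The sketch below uses a shortened sample value modeled on the entries in the question. Note that the ${variable/pattern/replacement} form is a Bash and Zsh feature rather than POSIX sh, and that with a single slash only the first match is replaced.

```shell
# Sample value modeled on the question's entries (shortened for illustration).
sample="/home/helikarlab/joshl/.conda/envs:/util/opt/anaconda/deployed-conda-envs/packages/python/envs"

# Replace only the first occurrence of /home with /work.
replaced=${sample/\/home/\/work}

# Print one entry per line to make the result easy to inspect.
printf '%s\n' "$replaced" | tr ':' '\n'
```

The first entry becomes /work/helikarlab/joshl/.conda/envs and the shared system entry is left as it was.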
    

    Option 3: Prepend desired default

    The most general override would be to prepend the desired default path to the environment variable:

    export CONDA_ENVS_PATH="/work/helikarlab/joshl/.conda/envs/:${CONDA_ENVS_PATH}"
    

    This is probably the most robust, since it assumes nothing about the inherited value.
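
Because shell startup files can be sourced more than once (for example in nested shells or job scripts), a plain prepend may add the same path repeatedly, and it leaves a trailing colon if the variable happens to be empty. A minimal sketch of a guarded prepend follows, assuming a Bash-compatible shell; the function name prepend_envs_path is hypothetical.

```shell
# Hypothetical helper: make a directory the first entry of CONDA_ENVS_PATH
# unless it already is, without leaving a stray colon.
prepend_envs_path() {
  local dir="${1%/}"   # drop one trailing slash so comparisons are consistent
  case "${CONDA_ENVS_PATH:-}" in
    "$dir"|"$dir":*) ;;   # already first: leave the variable untouched
    *) export CONDA_ENVS_PATH="${dir}${CONDA_ENVS_PATH:+:${CONDA_ENVS_PATH}}" ;;
  esac
}

prepend_envs_path "/work/helikarlab/joshl/.conda/envs/"
```

The check only looks at the first entry, so a copy of the path further down the list is tolerated; this keeps the function short at the cost of a possible harmless duplicate.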

    Additional Note

    Users with small disk quotas in default locations should also consider moving the package cache (pkgs_dirs) to coordinate with the environments directory. Details in this answer.


    [1]: CONDA_ENVS_DIRS and CONDA_ENVS_PATH are interchangeable; however, only one can be defined at a time. The former is the contemporary usage, so I believe the latter is supported for backward compatibility.