Tags: python, anaconda, dependencies, conda, nexus

How to freeze conda on a fixed version


I've been asked to look at some DevOps work involving Python and I'm a bit stuck. The network I'm working on is not connected to the internet, so I've been setting up Nexus repositories to bring in dependencies for Docker, Java and PyPI that the other developers can access and pull down locally. However, they have started using conda more and more, and we are on a fixed version on our dev network to match a delivery network.

I'm trying to use Nexus's conda repos, but every time I try to install something it tries to update everything else, including the Python and conda versions, which are:

      conda version : 4.8.3
conda-build version : 3.18.11
     python version : 3.8.3.final.0

I've edited my .condarc file to read:

channels:
  - http://master:8041/repository/anaconda-proxy/main/
  - http://master:8041/repository/conda-forge/
remote_read_timeout_secs: 1200.0
auto_update_conda: false
channel_priority: false

However, every time I try to install something to cache the dependencies, I get a huge list of updates. For example:

conda install cudatoolkit
<snip>
The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    alabaster-0.7.12           |             py_0          16 KB  http://master:8041/repository/anaconda-proxy/main
    anaconda-client-1.7.2      |           py38_0         172 KB  http://master:8041/repository/anaconda-proxy/main
    anaconda-project-0.8.4     |             py_0         210 KB  http://master:8041/repository/anaconda-proxy/main
    argh-0.26.2                |           py38_0          36 KB  http://master:8041/repository/anaconda-proxy/main
.....

Any advice would be great. I've added the auto_update_conda and channel_priority flags but to no avail. Thanks in advance.

Additional info: I'm a Java developer and I only use a bit of Python, so I'm not massively familiar with the Anaconda setup. Apologies if this is simpler than I'm making it.


Solution

  • How Conda Solves

    Conda always first attempts to solve the install directive without changing existing packages (i.e., it runs first with a --freeze-installed flag). It only proceeds to a full solve (what you are seeing) if it can't find any version of your requested package whose dependencies are all already satisfied in the environment. That is, this result implies that what you are asking for is not possible, or at least not via the CLI if you want a valid environment. [1]

    At the core of the issue is that even if there is only a single dependency that needs updating, there is no intermediate mode to indicate that you want to minimize the total number of changes (which I think would actually be a nice enhancement). Conda only has two solving modes:

    1. Change nothing else (--freeze-installed).
    2. All dependencies are allowed to update (--update-deps).

    The exceptions are the aggressive_update_packages and auto_update_conda settings: Conda will always attempt to update the packages they cover whenever the environment is mutated. But it seems you've already realized those can be disabled through configuration settings. [2]
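
    To check whether the "change nothing else" mode can satisfy a request at all, one option is to ask for a frozen solve explicitly as a dry run. The sketch below is only an illustration: the helper names are made up, and it assumes conda is on the PATH and that a failed frozen solve exits with a non-zero status (worth verifying on 4.8.3).

```python
import subprocess


def frozen_dry_run_cmd(spec):
    """Build a conda command that solves for `spec` without installing anything."""
    # --dry-run reports the plan only; --freeze-installed forbids changes
    # to packages that are already in the environment.
    return ["conda", "install", "--dry-run", "--freeze-installed", spec]


def frozen_solve_possible(spec):
    """Return True if the frozen dry run exits cleanly (assumed to mean success)."""
    result = subprocess.run(frozen_dry_run_cmd(spec), capture_output=True, text=True)
    return result.returncode == 0
```

    Calling frozen_solve_possible("cudatoolkit") on the affected machine should then indicate whether the long update list comes from the fallback to a full solve.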

    Manual Dependency Updating

    This doesn't mean what you are hoping to accomplish is impossible, but that there isn't a clean way to automate it via the CLI. Instead, you might need to manually track down the dependencies that need updating (e.g., conda search cudatoolkit --info), update them first (conda install with specific versions), and then try installing your package again. I would strongly recommend first settling on the exact version of cudatoolkit you plan to install, otherwise conda search cudatoolkit --info will be too much info.
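
    Narrowing the search output to one exact version can be scripted. This is a rough sketch, assuming the output of conda search with --json is a mapping from package name to a list of records that carry "version" and "depends" fields. The function name and the sample data (examplepkg) are hypothetical and not taken from a real channel.

```python
import json


def depends_for(search_json, name, version):
    """Collect the dependency specs listed for one exact version of a package."""
    records = json.loads(search_json).get(name, [])
    specs = set()
    for record in records:
        # Several builds can share a version; merge their dependency lists.
        if record.get("version") == version:
            specs.update(record.get("depends", []))
    return sorted(specs)


# Hypothetical sample standing in for `conda search examplepkg --info --json`.
sample_json = json.dumps({
    "examplepkg": [
        {"version": "1.0", "build": "py38_0", "depends": ["python >=3.8,<3.9.0a0", "six"]},
        {"version": "1.0", "build": "py37_0", "depends": ["python >=3.7,<3.8.0a0", "six"]},
        {"version": "2.0", "build": "py38_0", "depends": ["python >=3.8,<3.9.0a0"]},
    ]
})

print(depends_for(sample_json, "examplepkg", "1.0"))
```

    The resulting list is what you would compare by hand against conda list to find the packages that need updating first.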

    Package Pinning

    For packages that you really do want absolutely fixed there is package pinning. You could do this for conda, python, and other core packages.
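
    Conda reads pins from a file named pinned in the environment's conda-meta directory, one spec per line. A minimal sketch of generating that file follows. The helper name is made up, the versions are the ones from the question, and the demo writes to a throwaway directory rather than a real environment prefix.

```python
import tempfile
from pathlib import Path


def write_pins(prefix, pins):
    """Write one exact-version pin per line to <prefix>/conda-meta/pinned."""
    pinned = Path(prefix) / "conda-meta" / "pinned"
    pinned.parent.mkdir(parents=True, exist_ok=True)
    lines = [f"{name} =={version}" for name, version in pins.items()]
    # Note: this overwrites any pins that already exist in the file.
    pinned.write_text("\n".join(lines) + "\n")
    return pinned


# Demo against a temporary directory standing in for an environment prefix.
with tempfile.TemporaryDirectory() as demo_prefix:
    demo_text = write_pins(demo_prefix, {"conda": "4.8.3", "python": "3.8.3"}).read_text()

print(demo_text)
```

    Pointing prefix at the real environment directory would apply the pins there. It is worth checking with a dry-run install afterwards that the solver respects them.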

    Base Environment

    I find it a bit odd that the base environment (the one that has the conda package) is being mutated at all. Instead, I would expect software engineers to always use non-base environments for development and production. It is easy to create new environments, one can define them with version controlled YAML files, use them modularly by creating them on a per project or per task-type basis, and they can be mutated without worrying about affecting the Conda infrastructure. However, I'm not entirely clear on your setup, so this comment may not apply.


    [1] If one doesn't care about validity (probably not a good idea for production) then there is always the --no-deps flag.

    [2] The default aggressive_update_packages packages are ones that frequently become vulnerable to exploits (e.g., openssl), so carefully consider the implications of leaving them outdated.