Tags: conda, python-packaging, conda-build

Why does conda install require dependency channels to be specified for our own package?


We created the SMEAGOL Python library and pushed it to Anaconda.org: https://anaconda.org/GSL_tools/smeagol-bio

When installing with conda install -c gsl_tools smeagol-bio, the installation gets stuck and never seems to finish solving the environment.

When the dependency channels are specified, as in conda install -c gsl_tools -c conda-forge -c bioconda smeagol-bio, the installation is successful.

**What could be the cause of this behaviour?**

The meta.yaml looks like this:

{% set name = "smeagol-bio" %}
{% set data = load_setup_py_data() %}

package:
  name: "{{ name|lower }}"
  version: {{ data.get("version") }}

source:
  path: .

build:
  noarch: python  # not OS architecture dependent
  script: python -m pip install --no-deps .

channels:
  - conda-forge
  - bioconda

requirements:
  host:
    - python >=3.8
    - pip >=22.1.2
  run:
    - biopython>=1.79
    - h5py>=3.1.0 
    - keras>=2.4.3
    - numpy>=1.19.2
    - pandas>=1.2.5
    - pip>=22.1.2
    - python>=3.7
    - pytables>=3.6.1
    - pytest>=6.2.4
    - recommonmark>=0.7.1  
    - scikit-learn>=0.24.2
    - scipy>=1.7.0
    - seaborn>=0.11.1
    - setuptools>=57.0.0
    - sphinx>=5.0.0
    - statsmodels>=0.12.2
    - tensorflow>=2.5.0
    - deeplift>=0.6.13.0

The conda package was created by running conda build . inside the project directory. The resulting noarch/smeagol-bio-0.1.1-pypy_0.tar.bz2 file was then uploaded to anaconda.org. Installing with conda install -c gsl_tools smeagol-bio fails on different machines, including macOS (M1 and Intel) and Linux.


Solution

  • This is how it is implemented, unfortunately. Conda packages do not capture dependency channel information in the final metadata and so this must be provided at solve time by the end user.

    This is certainly one of the greatest weaknesses of Conda packaging, especially because compiled packages can be coupled to the specific ABI of the packages used at compile time, which can be channel-specific. One can find a ton of questions on SO that suffer from this, with the -c bioconda vs -c conda-forge -c bioconda being the most frequent trouble point.

    Fortunately, most of this goes away if you commit to a specific channel ecosystem. For example, bioinformatics users typically just commit to Bioconda and so prioritize Conda Forge, then Bioconda in their Conda configuration.
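
As a rough sketch of that last suggestion, a user-level Conda configuration that prioritizes Conda Forge and then Bioconda might look like the fragment below. The file location (~/.condarc) and the strict channel priority setting are assumptions based on common Bioconda setups, not something stated in the question; adjust them to your environment.

```yaml
# ~/.condarc (assumed location of the user-level Conda configuration)
# Channels are listed from highest to lowest priority.
channels:
  - conda-forge   # searched first
  - bioconda      # searched second
# Assumption: strict priority, so a package found in a higher-priority
# channel is not replaced by a build from a lower-priority one.
channel_priority: strict
```

With a configuration like this in place, conda install -c gsl_tools smeagol-bio should be able to resolve the dependencies from the configured channels, since channels given with -c are searched in addition to those in the configuration file. The gsl_tools channel still has to be named explicitly unless it is added to the list as well.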