Cannot use Snakemake's --containerize flag with already existing named conda environments


Here is what I tried:

snakemake --use-conda --conda-frontend conda --containerize > dockerfile

This produces the following error:

  File "/REDACTED/.conda/envs/snakemake/lib/python3.10/site-packages/snakemake/deployment/containerize.py", line 32, in containerize
    envs = sorted(
  File "/REDACTED/.conda/envs/snakemake/lib/python3.10/site-packages/snakemake/deployment/containerize.py", line 30, in relfile
    return env.file.get_path_or_uri()
AttributeError: 'NoneType' object has no attribute 'get_path_or_uri'

Reproducible example is below:

Main snakefile:

configfile: "conf/config.yaml"
rule all:
    input:
        "output.txt"
include: "rules/test.smk"

test.smk snakefile:

rule test:
    output:
        "output.txt"
    conda:
        "biotools"
    shell:
        """
        python -VV > {output}
        """

Removing the conda directive from the rule above works (obviously). Putting a file py38.yaml in an envs directory inside the rules folder and replacing conda: "biotools" with conda: "envs/py38.yaml" also works. But I specifically want my container to include my already existing named conda environments.

Do I really have to rewrite my workflow to replace every conda: X with conda: envs/X.yaml and dump all my conda environments into .yaml files?


Solution

  • Correct, --containerize only works with environments defined in YAML files. Consider filing a feature request.

    A simple solution is to dump out YAMLs for your environment:

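    ## minimal version: only the packages you explicitly requested
    mamba env export -n biotools --from-history > biotools.min.yaml
    
    ## full version: every installed package, with exact versions and builds
    mamba env export -n biotools > biotools.full.yaml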
    

    Then point the rule's conda directive at one of those files instead of the environment name. There are some recommendations in this thread on best practice for YAML constraints; for example, you may want to adjust the YAML to make it more portable across machines.

    Note that it is generally discouraged to use named environments in Snakemake pipelines because they undermine reproducibility.
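
If the workflow has many rules, the directive rewrite asked about in the question can be scripted rather than done by hand. The following is only a rough sketch, not part of Snakemake: the function name `point_to_yaml` and the `env_dir` default are made up for illustration, and it assumes each named environment has been exported to `<env_dir>/<name>.yaml`. It relies on a simple regular expression, so it will also touch any quoted string that happens to follow the text `conda:` elsewhere in the file; review the result before committing it.

```python
import re

# Matches a `conda:` directive followed by a quoted value, either on the same
# line or on the following line, e.g.  conda: "biotools"
CONDA_DIRECTIVE = re.compile(r'(\bconda:\s*)(["\'])([^"\']+)\2')


def point_to_yaml(snakefile_text, env_dir="envs"):
    """Rewrite named-environment references into YAML file paths.

    Hypothetical helper: assumes every named environment was exported
    to <env_dir>/<name>.yaml beforehand.
    """

    def replace(match):
        prefix, quote, value = match.groups()
        # Values that already point at a YAML file are left unchanged.
        if value.endswith((".yaml", ".yml")):
            return match.group(0)
        return f"{prefix}{quote}{env_dir}/{value}.yaml{quote}"

    return CONDA_DIRECTIVE.sub(replace, snakefile_text)
```

Applied to the text of rules/test.smk from the question, this would turn conda: "biotools" into conda: "envs/biotools.yaml". Since the path is resolved relative to the Snakefile that contains the rule, the exported file would then need to live at rules/envs/biotools.yaml.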